(54) Integrated vehicle voice enhancement system and hands-free cellular telephone system



(57) An integrated vehicle voice enhancement sys- 
tem and hands-free cellular telephone system imple- 
ments microphone steering techniques and noise 
reduction filtering to improve the intelligibility and clarity
of transmitted signals. A microphone steering switch is 
provided for the cellular telephone interface which 
allows only one of the microphones to be switched in to 
an "on" state at any given time. The microphone steering
switch generates a raw telephone input signal that is
a combination of 100% of the designated primary micro- 
phone signal and approximately 20% of the microphone 
signals from microphones in the "off" state. In this man- 
ner, the telephone line does not appear dead to a lis- 
tener on the other end of the telephone line when 
speech is not present in the telephone input signal. A 
noise reduction filter filters the raw telephone signal in 
the time domain in real time to improve the clarity of the 
telephone input signal when speech is present in the tel- 
ephone input signal. A microphone steering switch for 
the voice enhancement system is also provided to 
implement switching between acoustically coupled 
microphones located within the vehicle.

Description

FIELD OF THE INVENTION 

[0001] The invention relates to vehicle voice enhancement systems and hands-free cellular telephone systems using
microphones mounted throughout a vehicle to sense driver and/or passenger speech. In particular, the invention relates 
to improvements in the selection of transmitted microphone signals and noise reduction filtering. 

BACKGROUND OF THE INVENTION 

[0002] A vehicle voice enhancement system uses intercom systems to facilitate conversations of passengers sitting 
within different zones of a vehicle. A single channel voice enhancement system has a near-end zone and a far-end zone 
with one speaking location in each zone. A near-end microphone senses speech in the near-end zone and transmits a 
voice signal to a far-end loudspeaker. The far-end loudspeaker outputs the voice signal into the far-end zone, thereby
enhancing the ability of a driver and/or passenger in the far-end zone to listen to speech occurring in the near-end zone
even though there may be substantial background noise within the vehicle. Likewise, a far-end microphone senses 
speech in the far-end zone and transmits a voice signal to a near-end loudspeaker that outputs the voice signal into the 
near-end zone. Voice enhancement systems not only amplify the voice signal, but also bring an acoustic source of the 
voice signal closer to the listener. 

[0003] Microphones are typically mounted within the vehicle near the usual speaking locations, such as on the ceiling
of the vehicle passenger compartment above the seats or on seat belt shoulder harnesses. Inasmuch as microphones 
are present when implementing a vehicle voice enhancement system, it is desirable to use the voice enhancement sys- 
tem microphones in combination with a cellular telephone system to provide a hands-free cellular telephone system 
within the vehicle. 

[0004] It is important that an integrated voice enhancement system and hands-free cellular telephone system be able
to transmit clear, intelligible voice signals. This can be difficult in a vehicle because significant acoustic changes can
occur quickly within the passenger compartment of the vehicle. For instance, background noise can change substan- 
tially depending on the environment around the vehicle, the speed of the vehicle, etc. Also, the acoustic plant within the 
passenger compartment can change substantially depending upon temperature within the vehicle and/or the number
of passengers within the vehicle, etc. Adaptive acoustic echo cancellation as disclosed in U.S. Patent Nos. 5,033,082
and 5,602,928 and pending U.S. patent application Serial No. 08/626,208 can be used to effectively model various
acoustic characteristics within the passenger compartment to remove annoying echoes. However, even after annoying 
echoes are removed, background noise within the vehicle passenger compartment can distort voice signals. Further, 
microphone switching can create unnatural speech patterns and annoying clicking noises. 

[0005] Providing intelligible and natural sounding voice signals is important for voice enhancement systems, and is
also important for hands-free cellular telephone systems. However, providing intelligible and natural sounding voice sig- 
nals is typically more difficult for cellular telephone systems. This is because a listener on the other end of the line must 
be able to not only clearly hear speech from the vehicle but also must be able to easily detect whether the cellular
telephone is on-line. That is, the line must not appear dead to the listeners when no speech is present in the vehicle. Also,
the listener on the other end of the line is typically in a quiet environment and the presence of background vehicle noises
during speech is annoying. 

SUMMARY OF THE INVENTION 

[0006] The invention is an integrated vehicle voice enhancement system and hands-free cellular telephone system
that implements a voice activated microphone steering technique to provide intelligible and natural sounding voice sig- 
nals for both the voice enhancement aspects of the system and the hands-free cellular telephone aspects of the sys- 
tem. This invention arose during continuing development efforts relating to the subject matter of U.S. Patent Nos. 
5,033,082; 5,602,928; 5,172,416; and copending U.S. patent application Serial No. 08/626,208 (entitled "Acoustic Echo
Cancellation In An Integrated Audio and Telecommunication Intercom System"), all incorporated herein by reference.
The invention applies to both single channel (SISO) and multiple channel (MIMO) systems. 

[0007] In one aspect the invention involves the use of a microphone steering switch that inputs echo-cancelled voice 
signals from the microphones within the vehicle and outputs a raw telephone input signal. Each of the microphones in 
the system has the capability of switching between an "off" state and an "on" state. The microphones are voice activated 
such that a respective microphone can switch into the "on" state only when the sound level in the microphone signal
(e.g. dB) exceeds a threshold switching value, thus indicating that speech is present in a speaking location near the 
microphone. The microphone steering switch outputs a raw telephone input signal which is preferably a combination of 
100% of the microphone output from the microphone in the "on" state, and preferably approximately 20% of the microphone
output from the microphone(s) in the "off" state. In order for the telephone input signal to be intelligible to a person
on the other end of the cellular telephone line, the invention allows only one of the microphones to be designated
as the primary microphone (i.e. switched to the "on" state) at any given time.

EP 0 932 142 A2

[0008] The invention implements microphone steering techniques for the designation of primary microphone signals 
into the "on" state so that no two microphones are switched into the "on" state at the same time. Yet, microphone output
between the "on" and "off" states fades out and cross-fades between microphones in a manner that is not annoying to
the driver and/or passengers within the vehicle or a person on the other end of the cellular telephone line.
[0009] When generating the raw telephone input signal, it is desirable that a rather high percentage of the microphone 
output for the microphones in the "off" state, for example approximately 20%, be transmitted so that the cellular telephone
line does not appear dead to a person on the other end of the telephone line when speech is not present within
the vehicle. 
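
The mixing rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `raw_telephone_input`, the per-sample list interface, and the choice to sum the attenuated "off" microphones are hypothetical. Only the 100% weight for the primary microphone and the approximately 20% weight for "off" microphones come from the text.

```python
def raw_telephone_input(mic_samples, primary_index, off_gain=0.2):
    """Mix one echo-cancelled sample per microphone into one raw telephone sample.

    The primary ("on") microphone contributes 100%; every other microphone is
    in the "off" state and contributes off_gain (about 20% in the text).
    primary_index may be None when no microphone is designated as primary.
    """
    total = 0.0
    for index, sample in enumerate(mic_samples):
        gain = 1.0 if index == primary_index else off_gain
        total += gain * sample
    return total
```

With no primary microphone designated, every microphone still contributes its attenuated share, which is what keeps the line from appearing dead to the far listener.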

[0010] In a second aspect, the invention applies noise reduction filters to filter out the background vehicle noise in the
system microphone signals. In a microphone steering context, the filters are designed to remove the noise in the signals
corresponding to the microphone(s) in the "on" state. The noise reduction filters are important for three primary reasons:

1. They generate a noise-reduced telephone input signal having improved clarity. By properly steering and switching
the microphone signals, an intelligible raw telephone input signal is derived from the set of system microphone
signals. However, this signal also contains a relatively large amount of background noise which in many cases
severely degrades the quality of the speech signal, especially to a listener in a quiet environment on the other end
of the line.

2. They reduce the background noise that is rebroadcast to the system loudspeakers in both SISO and MIMO
voice enhancement systems. The rebroadcast of the background noise is very perceivable in situations where the
noise characteristics spatially vary within the vehicle. This is common in large vehicles where the amount of wind
noise (e.g. open/closed window or sunroof), HVAC/fan noise, road noise, etc. varies depending on the passenger's
position in the vehicle.

3. For vehicles employing voice recognition systems (for example, those that are used to interpret hands-free cellular
phone commands), the background noise on the microphone signal(s) can severely degrade the performance
of such systems. The noise reduction filter(s) reduce the background noise and therefore improve the performance
of the voice recognition.

[0011] In the most general configuration, the noise reduction filters are applied to each of the microphone signals after the echo
has been subtracted. However, if processing power is limited on the electronic controller, a single noise reduction filter 
can be applied to the microphone steering switch output to remove the background noise in the outgoing cell phone sig- 
nal. 

[0012] The preferred noise reduction filter includes a bank of fixed filters, preferably spanning the audible frequency
spectrum, and a time-varying filter gain element β_m corresponding to each fixed filter. The raw input signal is input to
each of the fixed filters, and the output of each fixed filter z_m(k) is weighted by the respective time-varying filter gain
element β_m. A summer combines the weighted and filtered input signals and outputs a noise-reduced input signal. The
preferred noise reduction filters process the raw input signal in real time in the time domain. Therefore, the need for
inverse transforms, which are computationally burdensome, is eliminated. The time-varying filter gain elements are preferably
adjusted in accordance with a speech strength level for the output of each respective fixed filter. In this manner, the
noise reduction filter tracks the sound characteristics of speech present in the raw input signal over time, and gives
emphasis to bands containing speech, while at the same time fading out background noise occurring within bands in
which speech is not present. However, if no speech at all is present in the raw input signal, the noise reduction filter will
allow sufficient signal to pass therethrough so that the cellular telephone line does not appear dead to someone on the
other end of the line.

[0013] The preferred transform is a recursive implementation of a discrete cosine transform modified to stabilize its
performance on digital signal processors. The preferred transform (i.e. Equations 1 and 2) has several important
properties that make it attractive for this invention. First, the preferred transform is a completely real valued transform and
therefore does not introduce complex arithmetic into the calculations as with the discrete Fourier transform (DFT). This
reduces both the complexity and the storage requirements. Second, this transform can be efficiently implemented in a
recursive fashion using an IIR filter representation. This implementation is very efficient, which is extremely important for
voice enhancement systems where the electronic controllers are burdened with the other echo-cancellation tasks.
[0014] It should be noted that the preferred transform (i.e. Equations 1 and 2) has two major advancements over the
traditional recursive-type transforms mentioned in the literature. Traditional recursive-type transforms, including the
"sliding" DFT transform, often suffer from filter instability problems. This instability is the result of round-off errors which
arise when the filter parameters are implemented in the finite precision environment of a digital signal processor (DSP).
More precisely, the instability is due to non-exact cancellation of the "marginally" stable poles of the filter, which is
caused by the parameter round-off errors. The preferred transform presented here is designed to overcome these
problems by modifying the filter parameters according to a γ factor. This stabilizes the filter and is well suited for a variety of
hardware systems since γ can be adjusted to accommodate different fixed or floating-point digital signal processors.
Another advancement of the preferred transform over the conventional transforms is that each of the filters in the
preferred transform is appropriately scaled such that the summation of all of the filter outputs, z_m(k), m = 0...M-1, at any
instant in time equals the input at that instant in time. Thus, the combining of the outputs acts as an inverse transform.
Therefore, an explicit inverse transform is not required. This further increases the efficiency of the transformation.
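
The two properties just described (pole damping for stability, and outputs that sum to the input) can be illustrated with a damped "sliding" DFT. This is a stand-in chosen for illustration only: it is not the preferred transform of Equations 1 and 2, which is a real-valued discrete cosine transform and is not reproduced in this section, and this sketch uses complex arithmetic. The class name `DampedSlidingBank` and the default damping value are assumptions.

```python
import cmath
from collections import deque


class DampedSlidingBank:
    """Recursive M-band filter bank built from a damped sliding DFT."""

    def __init__(self, n_bands, gamma=0.999):
        self.n_bands = n_bands
        # Pole of band m sits at gamma * exp(-j*2*pi*m/M), just inside the unit circle.
        self.poles = [gamma * cmath.exp(-2j * cmath.pi * m / n_bands)
                      for m in range(n_bands)]
        self.gamma_m = gamma ** n_bands                 # damping applied to x(k - M)
        self.history = deque([0.0] * n_bands, maxlen=n_bands)  # last M inputs
        self.state = [0j] * n_bands

    def step(self, x):
        """Push one input sample x(k) and return the scaled band outputs z_m(k)."""
        oldest = self.history[0]                        # x(k - M)
        self.history.append(x)
        comb = x - self.gamma_m * oldest                # comb section shared by all bands
        self.state = [comb + pole * s for pole, s in zip(self.poles, self.state)]
        # Scaling by 1/M makes the band outputs sum to the current input sample.
        return [s / self.n_bands for s in self.state]
```

Because each pole is pulled inside the unit circle by the damping factor, small coefficient errors no longer leave a pole on or outside the circle, and summing the scaled outputs reproduces the current input without a separate inverse transform.
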
[0015] The time-varying gain elements β_m applied to the filtered input signals also have several major improvements
over the existing approaches. It should be noted that the performance of the system lies solely in the proper calculation
of the gain elements β_m, since with unity gain elements the system output is equal to the input signal, resulting in no
noise reduction. Existing techniques often suffer from poor speech quality. This results from the filter's inability to adjust
to rapidly varying speech, giving the processed speech a "choppy" sound characteristic. The approach taken here
overcomes this problem by adjusting the time-varying gain elements β_m in a frequency-dependent manner to ensure a fast
overall dynamic response of the system. The β_m gains corresponding to high frequency bands are determined
according to a speech strength level computed from a relatively small number of filter output samples, z_m(k), since high
frequency signals vary quickly with time and therefore fewer outputs are needed to accurately estimate the output power.
On the other hand, the β_m gains corresponding to low frequency bands are computed from a larger number of filter
output samples in order to accurately measure the power of low frequency signals, which are slowly time-varying. By
determining the β_m gains in this frequency band-dependent fashion, each band in the filter is optimized to provide the fastest
temporal response while maintaining accurate power estimates. If the system β_m gains for the bands were determined
in the same manner or by using the same formula, as is common in existing methods, the dynamic response of the high
frequency bands would be compromised to achieve accurate low-frequency power estimates. Furthermore, this approach uses a
closed-form expression for the β_m gain based on the speech strength levels in each band, and therefore does not
require a table of gain elements to be stored in memory. This expression also has been derived such that when speech
levels are low in a particular frequency band, the β_m gain of the band is not set to zero, but to some low level value. This
is important so that the cell phone input does not appear "dead" to the listener at the other end of the line, and it also
significantly reduces signal "flutter".
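
A rough sketch of the band-dependent gain calculation follows. The closed-form expression of the invention is not given in this section, so the formula in `band_gain`, the definition of speech strength as a ratio to a noise power estimate, the window lengths, and all function names are hypothetical. The sketch only reflects the three properties stated above: longer averaging windows for low bands, a closed-form gain with no lookup table, and a gain that never drops to zero.

```python
def band_window_lengths(n_bands, longest=64, shortest=8):
    """Averaging window (in samples) per band: long for low bands, short for high bands."""
    if n_bands == 1:
        return [longest]
    step = (longest - shortest) / (n_bands - 1)
    return [round(longest - m * step) for m in range(n_bands)]


def band_power(outputs, window):
    """Mean-square power over the most recent `window` filter output samples."""
    recent = outputs[-window:]
    return sum(abs(z) ** 2 for z in recent) / len(recent)


def band_gain(power, noise_power, floor=0.1):
    """Hypothetical closed-form gain: `floor` for noise-only bands, near 1 for strong speech."""
    strength = max(power / noise_power - 1.0, 0.0)   # assumed speech strength level
    return floor + (1.0 - floor) * strength / (1.0 + strength)
```

For example, `band_gain(1.0, 1.0)` returns the floor value because the band holds only noise, while a band whose power is far above the noise estimate receives a gain close to one.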

[0016] In another aspect, the invention implements microphone steering switches for multiple channel voice enhancement
systems. For instance, such a MIMO voice enhancement system typically has two or more microphones in a near-
end acoustic zone and two or more microphones in a far-end acoustic zone. While the microphones in the near-end 
zone are typically not acoustically coupled to the microphones in the far-end zone, microphones within the near-end 
zone may be acoustically coupled to one another and microphones within the far-end zone may be acoustically coupled 
to one another. In implementing the MIMO voice enhancement system, it is desirable that only one of the microphones 
in the near-end zone be designated as a primary microphone (i.e. switched into the "on" state) at any given time in order 
for the transmitted input signal to the far-end zone to be intelligible. This is important not only when two or more pas- 
sengers within the vehicle are speaking, but also to prevent acoustic spill over from one speaking location in the near- 
end zone to another speaking location in the near-end zone which could cause microphone falsing. Preferably, a similar 
steering switch is provided to generate a transmitted near-end input signal from the far-end microphone signals. In 
implementing the steering switches for the voice enhancement system, it is preferred that microphones in the "off" state 
contribute a small percentage of the microphone output, such as 5%-10% or less, so that transmission of background 
noise through the voice enhancement system is not noticeable by the driver and/or passengers within the vehicle. It is 
desirable that a small undetectable percentage of the microphone output be contributed to the respective input signal 
to prevent annoying microphone clicking that would occur if the microphone switches electrically between being on and
being completely off. 

BRIEF DESCRIPTION OF THE DRAWINGS 
[0017] 

Fig. 1 is a schematic illustration of an integrated vehicle voice enhancement system and hands-free cellular tele- 
phone system. 

Figs. 2A and 2B are graphs illustrating voice activated switching in accordance with the invention. 
Fig. 3A is a block diagram illustrating the operation of an integrated single channel vehicle voice enhancement sys- 
tem and hands-free cellular telephone system in accordance with the invention, which uses a single noise reduc- 
tion filter. 

Fig. 3B is a block diagram illustrating the operation of an integrated single channel vehicle voice enhancement
system and hands-free cellular telephone system in accordance with the invention, which uses a plurality of noise
reduction filters.

Fig. 4 is a state diagram illustrating a preferred microphone steering technique.

Fig. 5 is a plot illustrating the designation of one of the microphones in the system as a primary microphone, thus
switching the designated primary microphone from an "off" state to an "on" state.

Figs. 6A and 6B are plots illustrating cross-fading from a first primary microphone to a second primary microphone.

Fig. 7 is a plot illustrating fade-out of a primary microphone from an "on" state to an "off" state.

Fig. 8A is a schematic drawing illustrating the preferred manner of noise reduction filtering for the cellular telephone
input signal.

Figs. 8B, 8C and 8D are schematic block diagrams showing the preferred transforms implemented in the noise
reduction filter shown in Fig. 8A.

Fig. 9A is a block diagram illustrating an integrated multiple channel vehicle voice enhancement system and
hands-free cellular telephone system in accordance with the invention, which uses a single noise reduction filter.

Fig. 9B is a block diagram illustrating an integrated multiple channel vehicle voice enhancement system and
hands-free cellular telephone system in accordance with the invention, which uses a plurality of noise reduction filters.

Fig. 10 is a state diagram illustrating a preferred microphone steering technique for a telephone steering switch
shown in Fig. 9.

Fig. 11 is a state diagram illustrating a preferred microphone steering technique for voice enhancement steering
switches shown in Fig. 9.

DETAILED DESCRIPTION OF THE DRAWINGS 

[0018] Fig. 1 illustrates an integrated vehicle voice enhancement system and hands-free cellular telephone system 
10 in accordance with the invention. The system 10 has a near-end zone 12 and a far-end zone 14, both residing within 
a vehicle 15. Each zone 12 and 14 may be subject to substantial background noises. Thus, a passenger in the vehicle 
seated in the far-end zone 14 may have difficulty hearing a passenger and/or driver located in the near-end zone 12 
without the use of a vehicle voice enhancement system, or vice-versa. In addition to implementing a voice enhancement
system, it may be desirable to use active sound control or the like to reduce background noises within the vehicle
15. 

[0019] In Fig. 1, the near-end zone 12 includes two speaking locations 16 and 18, respectively. A first near-end
microphone 20 senses noise and speech at speaking location 16. A second near-end microphone 22 senses noise and
speech at speaking location 18. A first near-end loudspeaker 24 introduces sound into the near-end zone 12 at
speaking location 16. A second near-end loudspeaker 26 introduces sound into the near-end zone 12 at speaking location
18. It is preferred that the first near-end microphone 20 be located in close proximity to the first speaking location 16 in
the near-end acoustic zone 12, such as on the ceiling of the vehicle 15 directly above the speaking location 16 or on a
seat belt worn by a driver or passenger located in speaking location 16. Likewise, it is preferred that the second
near-end microphone 22 be located in close proximity to the second near-end speaking location 18 in the near-end acoustic
zone 12. Because of the close proximity between speaking locations 16 and 18, the microphones 20 and 22 in the
near-end zone will typically be coupled acoustically. For instance, sound present at speaking location 16 in the near-end zone
12 is detected primarily by the first microphone 20 but can also be detected to some extent by the second microphone
22 in the near-end zone 12, and vice-versa. The first near-end microphone 20 generates a first near-end voice signal
that is transmitted through line 28 to an electronic controller 30. Likewise, the second near-end microphone 22
generates a second near-end voice signal that is transmitted through line 32 to the electronic controller 30.
[0020] The far-end zone 14 in the vehicle 15 includes a first speaking location 34 and a second speaking location 36.
A first far-end microphone 38 senses noise and speech at speaking location 34. A second far-end microphone 40
senses noise and speech at speaking location 36. A first far-end loudspeaker 42 introduces sound into the far-end zone
14 at speaking location 34. A second far-end loudspeaker 44 introduces sound into the far-end zone 14 at speaking
location 36. The first far-end microphone 38 generates a first far-end voice signal in response to noise and speech
present at speaking location 34. The first far-end voice signal is transmitted through line 46 to the electronic controller
30. The second far-end microphone 40 generates a second far-end voice signal in response to noise and speech
present at speaking location 36. The second far-end voice signal is transmitted through line 48 to the electronic
controller 30. It is preferred that the first far-end microphone 38 be located in close proximity to the first far-end speaking
location 34 in the far-end acoustic zone. Likewise, it is preferred that the second far-end microphone 40 be located in close
proximity to the second far-end speaking location 36 in the far-end zone 14. The first far-end microphone 38 and the
second far-end microphone 40 are acoustically coupled, inasmuch as speech present at speaking location 34 is sensed
primarily by the first far-end microphone 38 but is also sensed to some extent by the second far-end microphone 40,
and vice-versa.

[0021] The electronic controller 30 outputs a first near-end input signal in line 50 that is transmitted to the first
near-end loudspeaker 24. The electronic controller 30 also outputs a second near-end input signal that is transmitted through
line 52 to the second near-end loudspeaker 26. In addition, the electronic controller outputs a first far-end input signal
that is transmitted through line 54 to the first far-end loudspeaker 42. The electronic controller also outputs a second
far-end input signal that is transmitted through line 56 to the second far-end loudspeaker 44.
[0022] As described thus far, the system 10 can be used to provide voice enhancement and facilitate conversation
between a passenger or driver seated in the near-end zone 12 and a passenger seated in the far-end zone 14, or
vice-versa. Fig. 1 also shows a cellular telephone 58 integrated into the system 10. The electronic controller 30 outputs a
telephone input signal Tx_out that is transmitted through line 60 to the cellular telephone 58. The electronic controller 30
also receives a telephone receive signal Rx_in from the cellular telephone through line 62. In this manner, the electronic
controller 30 communicates with the cellular telephone 58 to provide for a hands-free cellular telephone system within
the vehicle 15.

[0023] Figs. 2A and 2B explain voice activated switching as preferably implemented for both the near-end
microphones 20 and 22 and the far-end microphones 38 and 40. Fig. 2A illustrates microphone input in terms of sound level
(dB), and Fig. 2B illustrates voice activated switching of microphone output between an "off" state and an "on" state in
relation to the microphone input shown in Fig. 2A. Microphone input sound level (dB) is preferably determined using a
short-time, average magnitude estimating function to detect whether speech is present. Other suitable estimating
functions are disclosed in Digital Processing of Speech Signals, Lawrence R. Rabiner, Ronald W. Schafer, 1978, Bell
Laboratories, Inc., Prentice Hall, pages 120-126. While each microphone 20, 22, 38 and 40 transmits a full signal to the
electronic controller 30, the electronic controller 30 includes a gate/switch that reduces the transmission of a respective
microphone signal at least when the sound level for the signal does not exceed the threshold switching value. Fig. 2A
illustrates that background noise present within the vehicle, time periods 64A, 64B, 64C and 64D, generally has a sound
level less than a threshold switching value depicted by dashed line 66. On the other hand, speech present during time
periods 68A and 68B generally has a sound level exceeding the threshold switching value 66. Microphone output
remains in an "off" state before speech is sensed by a respective microphone. Microphone output switches into an "on"
state once speech is present in a speaking location associated with the microphone, given that no other microphones
are switched into an "on" state. Fig. 2B shows microphone output initially in an "off" state, reference 70, which
corresponds to time period 64A in Fig. 2A in which only background noise is present in the microphone signal. Note that in
the "off" state 70, microphone output is preferably set to approximately 20% of the microphone output in the "on" state.
Fig. 2B shows microphone output switching to an "on" state 72 when speech is present and microphone input exceeds
the threshold switching value 66, region 68A in Fig. 2A. Microphone input sound level (dB) is preferably measured in
approximately 12 millisecond windows, thus a microphone can be switched into the "on" state at a rate faster than is
perceptible during normal conversation.

[0024] Fig. 2B further illustrates that microphone output remains in an "on" state even if the microphone input sound
level falls below the threshold switching value 66 for a relatively short amount of time. That is, microphone output holds
in an "on" state for at least a holding time period t_H, which is preferably equal to approximately one second. Once the
microphone input sound level drops below the threshold switching value 66 for more than the holding time period t_H, the
microphone output fades 74 from the "on" state 72 to the "off" state 76. It is desirable that microphone output when the
microphone is in the "off" state be greatly reduced, e.g. approximately 20% or less for cellular telephone transmission
and approximately 1%-10% for voice enhancement transmission, but not completely eliminated. If microphone output
is completely eliminated when the microphone is in the "off" state, annoying microphone clicking will occur, and the line
will appear dead when the microphone is in the "off" state. Providing a low level of microphone output when the
microphone is in the "off" state facilitates natural sounding voice enhancement and practical telephone signal transmission.
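
The voice activated switching of Figs. 2A and 2B can be sketched for a single microphone as follows. This is an illustrative simplification under stated assumptions: the function names, the instantaneous switch to the "on" state, and the linear fade step are hypothetical, and the condition that no other microphone is already "on" is left out. The approximately 12 millisecond window, the approximately one second holding time (about 83 windows) and the approximately 20% "off" level come from the text.

```python
import math


def average_magnitude_db(window):
    """Short-time average magnitude of one window of samples, in dB relative to 1.0."""
    mean_abs = sum(abs(s) for s in window) / len(window)
    return 20.0 * math.log10(max(mean_abs, 1e-12))


def gate_gains(levels_db, threshold_db, hold_windows=83, off_gain=0.2, fade_step=0.1):
    """Return the output gain applied in each analysis window of one microphone."""
    gain = off_gain
    quiet = hold_windows + 1              # start in the "off" state
    gains = []
    for level in levels_db:
        if level > threshold_db:
            gain, quiet = 1.0, 0          # speech: switch "on" and restart the hold timer
        else:
            quiet += 1
            if quiet > hold_windows:      # hold time expired: fade toward the "off" level
                gain = max(off_gain, gain - fade_step)
        gains.append(gain)
    return gains
```

The gain never falls below `off_gain`, so the microphone keeps contributing a low-level signal in the "off" state instead of being cut completely.
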
[0025] When generating the telephone input signal Tx_out for the cellular telephone 58, it is desirable that no more than
one of the microphones 20, 22, 38 or 40 be switched into the "on" state at any given time. This facilitates intelligibility of
the transmitted cellular telephone signal to a listener on the other end of the line when two or more persons in the
vehicle 15 are competing, and also prevents acoustic spill over between acoustically coupled microphones such as
microphones 20 and 22 or 38 and 40. Although it is desirable that microphone output remain at a low level when a
microphone is switched in an "off" state (e.g. approximately 20%), the presence of several microphones in a system can
create distortion, which is especially problematic for the single telephone input signal Tx_out transmitted to the cellular
telephone 58. The background noise that is present on the signal corresponding to the microphone in the "on" state is
also problematic for Tx_out, since the listener on the other end of the line is typically in a quiet environment, making such
noise objectionable. Thus, it is preferred that the telephone input signal Tx_out be filtered to remove the background noise
before transmission of the signal to the cellular telephone 58.

[0026] Fig. 3A illustrates a single channel (SISO) integrated voice enhancement system and hands-free cellular
telephone system 78 that includes a microphone steering switch 80 and a noise-reduction filter 82 for the telephone input
signal Tx_out. In many respects, the SISO system 78 shown in Fig. 3A is similar to the system 10 shown in Fig. 1 and like
reference numerals are used where appropriate to facilitate understanding. In Fig. 3A, the near-end microphone 20
senses sound in the near-end zone 12 and generates a near-end voice signal that is transmitted through line 28 to a
near-end echo cancellation summer 84. A near-end adaptive acoustic echo canceller 86 inputs the near-end input
signal from line 50. The near-end adaptive echo canceller 86 outputs a near-end echo cancellation signal in line 88 that
inputs the near-end echo cancellation summer 84. The near-end acoustic echo canceller 86 is preferably an adaptive
finite impulse response filter having sufficient tap length to model the acoustic path between the near-end loudspeaker
24 and the output of the near-end microphone 20. The near-end acoustic echo canceller 86 is preferably adapted using
an LMS update or the like, preferably in accordance with the techniques disclosed in copending patent application
Serial No. 08/626,208, entitled "Acoustic Echo Cancellation In An Integrated Audio And Telecommunication Intercom
System", by Brian M. Finn, filed on March 29, 1996, now U.S. Patent No. _ issued on _. The near-end echo
cancellation summer 84 subtracts the near-end echo cancellation signal in line 88 from the near-end voice signal in
line 28, and outputs an echo-cancelled, near-end voice signal in line 90. The near-end echo cancellation summer 84
thus subtracts from the near-end voice signal in line 28 that portion of the signal due to sound introduced by the
near-end loudspeaker 24.
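The echo cancellation summer and adaptive FIR canceller described above can be sketched as follows. This is a minimal illustration that assumes a plain LMS update; the referenced application may use a different adaptation rule. The class name, the tap count, the step size and the three-tap acoustic path in `demo` are hypothetical values chosen only for the example.

```python
import random


class LmsEchoCanceller:
    """Adaptive FIR filter with a plain LMS update (illustrative sketch)."""

    def __init__(self, taps, step):
        self.w = [0.0] * taps   # FIR model of the loudspeaker-to-microphone path
        self.u = [0.0] * taps   # recent loudspeaker samples, newest first
        self.step = step        # hypothetical LMS step size

    def process(self, loudspeaker, microphone):
        # Shift the newest loudspeaker sample into the delay line.
        self.u = [loudspeaker] + self.u[:-1]
        # Echo cancellation signal: the modelled echo at the microphone.
        echo = sum(w * u for w, u in zip(self.w, self.u))
        # Summer: subtract the modelled echo from the microphone signal.
        error = microphone - echo
        # LMS update driven by the echo-cancelled signal.
        self.w = [w + self.step * error * u for w, u in zip(self.w, self.u)]
        return error


def demo(samples=3000, seed=0):
    """Identify a made-up echo path when no one is speaking."""
    rng = random.Random(seed)
    path = [0.5, -0.3, 0.1]                 # hypothetical acoustic path
    aec = LmsEchoCanceller(taps=4, step=0.05)
    recent = [0.0] * len(path)
    residual = 0.0
    for _ in range(samples):
        u = rng.uniform(-1.0, 1.0)          # loudspeaker drive signal
        recent = [u] + recent[:-1]
        mic = sum(p * r for p, r in zip(path, recent))
        residual = aec.process(u, mic)
    return aec.w, residual
```

With no local speech present, the coefficients settle toward the simulated path and the residual echo in the echo-cancelled signal shrinks toward zero.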

[0027] The echo-cancelled, near-end voice signal in line 90 is transmitted both to a far-end input summer 92 and
through line 94 to the microphone steering switch 80. The far-end input summer 92 also receives components of the
far-end input signal other than the echo-cancelled near-end voice signal, such as a cellular telephone receive signal Rx_in
from line 96 or an audio feed (not shown), etc. The far-end input summer 92 outputs the far-end input signal in line 54
which drives the far-end loudspeaker 42.

[0028] The far-end microphone 38 senses sound in the far-end zone 14 at speaking location 34 and generates a
far-end voice signal that is transmitted through line 46 to a far-end echo cancellation summer 98. A far-end adaptive
acoustic echo canceller 100, preferably identical to the near-end adaptive acoustic echo canceller 86, receives the far-end
input signal in line 54 and outputs a far-end echo cancellation signal in line 102. The far-end echo cancellation signal in
line 102 inputs the far-end echo cancellation summer 98. The far-end echo cancellation summer 98 subtracts the
far-end echo cancellation signal in line 102 from the far-end voice signal in line 46 and outputs an echo-cancelled, far-end
voice signal in line 104. The far-end echo cancellation summer 98 thus subtracts from the far-end voice signal in line 46
that portion of the signal due to sound introduced by the far-end loudspeaker 42. The echo-cancelled, far-end voice
signal in line 104 is transmitted to both a near-end input summer 106, and to the microphone steering switch 80 through
line 108. A privacy switch 110 is located in line 108, thus allowing a passenger or driver within the vehicle to discontinue
transmission of the far-end echo-cancelled voice signal to the microphone steering switch 80 by opening the privacy
switch 110. A similar privacy switch 112 is located in line 96 between the cellular telephone 58 and the far-end input
summer 92 which enables a driver and/or passenger within the vehicle to discontinue transmission of the telephone
receive signal Rx_in from the cellular telephone 58 to the far-end loudspeaker 42 in the far-end zone 14.

[0029] The near-end input summer 106 also receives other components of the near-end input signal, such as the
cellular telephone receive signal Rx_in in line 114 or an audio feed (not shown), etc. The near-end input summer 106
outputs the near-end input signal in line 50 which drives the near-end loudspeaker 24.

[0030] Assuming that privacy switch 110 in line 108 is closed, the microphone steering switch 80 receives both the
echo-cancelled near-end voice signal through line 94 and the echo-cancelled far-end voice signal through line 108. The
microphone steering switch 80 combines and/or mixes the echo-cancelled voice signals preferably in the manner
described with respect to Figs. 4-7, and outputs a raw telephone input signal in line 116. In accordance with the
invention, the raw telephone input signal 116 inputs the noise reduction filter 82. The noise reduction filter 82 outputs a
noise-reduced telephone input signal Tx_out that inputs the cellular telephone 58.

[0031] Fig. 3B illustrates a single channel (SISO) integrated voice enhancement system and hands-free cellular
telephone system 78a which is similar to the system 78 shown in Fig. 3A. The primary difference in the system 78a in Fig.
3B is that the single noise reduction filter 82 in the system 78 shown in Fig. 3A has been replaced by a plurality of noise
reduction filters 82a, 82b. Noise reduction filter 82a is located in the near-end voice signal line 90. Noise reduction filter
82b is located in the far-end voice signal line 104. In addition to improving the clarity of the telephone input signal Tx_out,
this implementation also removes the background noise in the voice signals themselves. Noise reduction filter 82a
removes the background noise in the near-end voice line 90 and therefore prevents the rebroadcasting of this noise on
the far-end loudspeaker 42. Likewise, noise reduction filter 82b removes the background noise in the far-end voice line
104 and therefore prevents the rebroadcasting of this noise on the near-end loudspeaker 24. In other respects, the
system 78a shown in Fig. 3B is similar to the system 78 shown in Fig. 3A.

[0032] Figs. 4-7 illustrate the preferred microphone steering technique for the cellular telephone input signal which is
implemented by the microphone steering switch 80. Fig. 4 is a state diagram for voice activated switching between the
near-end microphone 20 labelled MIC 1 and the far-end microphone 38 labelled MIC 2. As shown in the state diagram
of Fig. 4, only one of the microphones 20, 38 can be switched into the "on" state at any given time. The idle state 120
indicates a state in which both microphones 20, 38 are in an "off" state. From the idle state 120, it is possible for either
the near-end microphone 20, MIC 1, to switch into an "on" state 122 or for the far-end microphone 38, MIC 2, to switch
into an "on" state 124. Arrows 122A and 124A from the idle state 120 illustrate that it is not possible for both of the
microphones 20 and 38 to be in the "on" state contemporaneously. Fig. 5 graphically depicts switching near-end microphone
20 output, MIC 1, into an "on" state 122 when the system is initially in the idle state 120. More specifically, the near-end
microphone 20, MIC 1, senses background noise and speech within the vehicle and generates a respective microphone
signal in response thereto. The magnitude of the microphone signal is determined in accordance with the voice
activated switching technique illustrated in Figs. 2A and 2B. Microphone output for the microphone 20, MIC 1, is maintained
in the "off" state if the magnitude of the microphone signal is below the threshold switching value 66. However, if initially
the system is in the idle state 120 (i.e. the sound level for both the near-end microphone 20, MIC 1, and the far-end
microphone 38, MIC 2, have remained below the threshold switching value 66), the first microphone having a
microphone signal with a magnitude exceeding the threshold switching value 66 switches to the "on" state. Fig. 5 shows the
near-end microphone 20 output switching from an "off" state 126 to an "on" state 128. The microphone selected to be
in the "on" state is referred to herein as the designated primary microphone. The raw telephone input signal in line 116
from the microphone steering switch 80 is preferably a combination of the full echo-cancelled voice signal from the
primary microphone and approximately 20% of the echo-cancelled voice signal from the other microphone.
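The mixing rule for the raw telephone input signal can be illustrated with a short sketch. The function name and the treatment of the idle state (every microphone passed at the "off" level) are assumptions made for the example; the 20% figure is the approximate value given in the text.

```python
OFF_LEVEL = 0.20  # approximate "off" state level stated in the text


def raw_telephone_input(samples, primary):
    """Mix one echo-cancelled sample per microphone into one output sample.

    samples: one echo-cancelled voice sample per microphone.
    primary: index of the designated primary microphone, or None when the
             switch is idle (assumed here to leave every microphone "off").
    """
    return sum(
        s if i == primary else OFF_LEVEL * s
        for i, s in enumerate(samples)
    )
```

For two microphones with samples 1.0 and 0.5, designating the first as primary gives 1.0 + 0.2 × 0.5 = 1.1, so the line never goes completely silent while either microphone picks up sound.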

[0033] Whenever either the near-end microphone 20, MIC 1, or the far-end microphone 38, MIC 2, is designated as
the primary microphone (i.e., the microphone output is switched to an "on" state), the microphone holds in the "on" state
even after the sound level of the microphone signal falls below the threshold switching value 66 for the holding time
period t_H. However, after the holding time period t_H expires, the microphone output for the primary microphone enters
a fade-out state 130, Fig. 4, as long as the sound level for the other microphone does not exceed the threshold switching
value 66. In Fig. 4, lines 122B and 124B illustrate respective microphones MIC 1 and MIC 2 entering the fade-out state
130. Line 130A illustrates that after the microphone completes the fade-out state 130, the system enters the idle state
120. Fig. 7 graphically depicts the switching action for the near-end microphone 20 output through the fade-out state
130. Microphone output begins in the "on" state 132, and holds in the "on" state for the holding time period 134 even
after the sound level for the microphone 20 signal falls below the threshold switching value 66. When the holding time
period t_H expires, the microphone 20 output enters the fade-out state 130 in which the microphone output fades from
the "on" state 134 to the "off" state 136. The preferred fade-out time period t_H is approximately three seconds.
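The hold and fade-out behaviour of Fig. 7 can be sketched as a gain envelope. The linear fade shape, the function name and the example holding time are assumptions; the text specifies only that the output holds, then fades over approximately three seconds to the "off" level.

```python
def primary_gain(t_below, hold_s, fade_s=3.0, on=1.0, off=0.20):
    """Output level of the primary microphone.

    t_below: seconds since its sound level last exceeded the threshold,
             assuming no other microphone has requested priority.
    hold_s:  holding time period (hypothetical value supplied by the caller).
    fade_s:  fade-out duration, about three seconds per the text.
    """
    if t_below <= hold_s:
        return on                        # still holding in the "on" state
    frac = min((t_below - hold_s) / fade_s, 1.0)
    return on + (off - on) * frac        # linear fade toward the "off" level
```

With a one-second hold, the level is 1.0 at 0.5 s, 0.6 halfway through the fade at 2.5 s, and 0.2 once the fade has completed.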
[0034] When the near-end microphone 20, MIC 1, is designated as the primary microphone, state 122, or the far-end
microphone 38, MIC 2, is designated as the primary microphone, state 124, and the sound level of the other
microphone exceeds the threshold switching value 66, it may be desirable under some circumstances to cross-fade
between the microphones as illustrated by cross-fade state 138, Fig. 4. Line 122C pointing towards the cross-fade state
138 illustrates the near-end microphone 20, MIC 1, as the designated primary microphone, cross-fading from the "on"
state 122 to the "off" state. Line 124C from the cross-fade state 138 illustrates that the far-end microphone 38, MIC 2,
contemporaneously fades on from the "off" state to the "on" state 124 to become the designated primary microphone.
Figs. 6A and 6B graphically depict the switching action for the cross-fading state 138 illustrated by lines 122C and 124C
and cross-fading state 138. Fig. 6A shows the near-end microphone 20, MIC 1, switching from the "off" state 140 to the
"on" state 142 in accordance with line 122A and state 122 in Fig. 4, thus designating the near-end microphone 20,
MIC 1, as the primary microphone. During the same time period, the far-end microphone 38, MIC 2, remains in the "off"
state, reference numerals 144 and 146 in Fig. 6B. If the sound level for the far-end microphone 38, MIC 2, exceeds the
threshold switching value 66 after the near-end microphone 20, MIC 1, has been designated as the primary microphone
(i.e. the sound level for the far-end microphone 38, MIC 2, exceeds the threshold switching value 66 during the time
period designated by reference numeral 146 in Fig. 6B), the far-end microphone 38, MIC 2, is designated as a priority
requesting microphone. The designated priority requesting microphone requests priority to become the designated
primary microphone, but does not enter the "on" state until the designated primary microphone relinquishes priority, even
though the sound level for the priority requesting microphone exceeds the threshold switching value 66. In other words,
the designated priority requesting microphone cannot become the designated primary microphone until the designated
primary microphone relinquishes priority. At the instant that the designated primary microphone relinquishes priority,
reference numeral 148 in Figs. 6A and 6B, the designated primary microphone (near-end microphone 20, MIC 1, in Fig.
6A) fades out from the "on" state 142 to the "off" state 150, as indicated by reference numeral 152 in Fig. 6A, and the
far-end microphone 38, MIC 2, contemporaneously cross-fades on from the "off" state 146 to the "on" state 154 as
illustrated by reference numeral 156. The designated primary microphone (i.e. the near-end microphone 20, MIC 1 in Fig.
6A) relinquishes priority if the holding time period t_H expires while the priority requesting microphone (i.e. the far-end
microphone 38, MIC 2 in Fig. 6B), is requesting priority (i.e. the sound level of the echo-cancelled, far-end voice signal
in line 108, Fig. 3, exceeds the threshold switching value 66). In addition, it is preferred in some circumstances that the
designated primary microphone relinquish priority even before the expiration of the holding time period t_H if statistically
it is determined that the sound level for the priority requesting microphone is sufficiently high compared to the sound
level for the designated primary microphone. For instance, it may be desirable for the designated primary microphone
to relinquish priority when the sound level for the priority requesting microphone exceeds the sound level for the
designated primary microphone on a time-averaged basis by 50% for at least one second.
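The early-relinquish example at the end of this paragraph can be expressed as a simple check. This sketch assumes that the time-averaged sound levels are already available as one value per sample for each microphone, and that the 50% margin must hold continuously for the full second; the function name and arguments are hypothetical.

```python
def should_relinquish(primary_avg, requester_avg, sample_rate_hz,
                      margin=0.5, duration_s=1.0):
    """Return True once the requesting microphone's averaged level has stayed
    more than `margin` (50%) above the primary's for `duration_s` seconds."""
    needed = int(duration_s * sample_rate_hz)
    run = 0  # consecutive samples satisfying the margin
    for p, r in zip(primary_avg, requester_avg):
        run = run + 1 if r > (1.0 + margin) * p else 0
        if run >= needed:
            return True
    return False
```

A requester that is only 40% louder, or that is 60% louder but drops back before a full second has elapsed, does not take priority under this rule.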

[0035] In Fig. 4, line 124D pointing towards the cross-fade state 138 illustrates that the far-end microphone 38, MIC
2, cross-fades from the "on" state to the "off" state. Line 122D from the cross-fade state 138 illustrates that
contemporaneously the near-end microphone 20, MIC 1, cross-fades on from the "off" state to the "on" state. Cross-fading from
the far-end microphone 38, MIC 2, as the designated primary microphone, state 124, to the near-end microphone 20,
MIC 1, as the designated primary microphone, state 122, is accomplished in the same manner as shown in Figs. 6A
and 6B and as described above with respect to a cross-fade from the near-end microphone 20, MIC 1, to the far-end
microphone 38, MIC 2.
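The state diagram of Fig. 4 can be summarized as a transition function. This is a sketch under stated assumptions: the state constants and flag names are hypothetical, a tie in the idle state is resolved in favour of MIC 1, and `relinquish` stands for either expiry of the holding time period or the early-relinquish condition.

```python
IDLE, MIC1_ON, MIC2_ON, FADE_OUT, XFADE_TO_1, XFADE_TO_2 = range(6)


def next_state(state, mic1_active, mic2_active, relinquish, fade_done):
    """One step of the two-microphone steering switch.

    micN_active: sound level of MIC N exceeds the threshold switching value.
    relinquish:  the primary microphone gives up priority.
    fade_done:   the current fade-out or cross-fade has completed.
    """
    if state == IDLE:
        if mic1_active:
            return MIC1_ON            # tie-break toward MIC 1 is an assumption
        return MIC2_ON if mic2_active else IDLE
    if state == MIC1_ON:
        if relinquish:
            return XFADE_TO_2 if mic2_active else FADE_OUT
        return MIC1_ON                # a requesting MIC 2 must wait
    if state == MIC2_ON:
        if relinquish:
            return XFADE_TO_1 if mic1_active else FADE_OUT
        return MIC2_ON
    if state == FADE_OUT:
        return IDLE if fade_done else FADE_OUT
    if state == XFADE_TO_1:
        return MIC1_ON if fade_done else XFADE_TO_1
    return MIC2_ON if fade_done else XFADE_TO_2
```

There is no direct transition from one "on" state to the other, which mirrors the requirement that only one microphone be in the "on" state at any given time.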

[0036] Fig. 8A illustrates the preferred noise reduction filter 82 which receives the raw telephone input signal
designated as x(k) in line 116 from the microphone steering switch 80 and system 78 shown in Fig. 3A. The same noise
reduction filter 82 is preferably used in the system 78a shown in Fig. 3B at the locations of noise reduction filters 82a,
82b to operate on the near-end and far-end voice signals, respectively. For the sake of clarity, the following discussion
relating to noise reduction filter 82 assumes that the noise reduction filter 82 is in the location shown in Fig. 3A. The raw
telephone input signal x(k) in line 116 inputs a plurality of M fixed filters h_0, h_1, h_2 ... h_{M-2}, h_{M-1}. The plurality of
fixed filters h_0, h_1, h_2 ... h_{M-2}, h_{M-1} preferably span the audible frequency spectrum. Each of the fixed filters outputs a
respective filtered telephone input signal z_0(k), z_1(k), z_2(k) ... z_{M-2}(k), z_{M-1}(k). The fixed filters are preferably a
recursive implementation of a discrete cosine transform in the time domain modified to stabilize performance on digital
signal processors; however, other types of fixed filters can be used in accordance with the invention. For instance,
Karhunen-Loeve transforms, wavelet transforms, or even the eigen filters for an eigen filter adaptation band filter (EAB)
or an eigen filter filter bank (EFB) as disclosed in U.S. Patent No. 5,561,598, entitled "Adaptive Control System With
Selectively Constrained Output And Adaptation" by Michael P. Nowak et al., issued on October 1, 1996, herein
incorporated by reference, are examples of other fixed filters that may be suitable for the noise reduction filter 82.

[0037] In the preferred embodiment of the invention, the plurality of fixed filters h_0, h_1, h_2 ... h_{M-2}, h_{M-1} are infinite
impulse response filters in which the filtered telephone input signals z_0(k), z_1(k), z_2(k) ... z_{M-2}(k), z_{M-1}(k) are
represented by the following expressions:

$$z_0(k) = \frac{1}{M}\left[x(k) - \gamma^{M} x(k-M)\right] + \gamma\, z_0(k-1) \qquad \text{(Eq. 1)}$$

for fixed filter h_0; and

$$z_m(k) = \frac{2}{M}\cos\!\left(\frac{\pi m}{2M}\right)\left[x(k) - \gamma\, x(k-1) + (-1)^{m}\gamma^{M+1} x(k-[M+1]) - (-1)^{m}\gamma^{M} x(k-M)\right] + 2\gamma\cos\!\left(\frac{\pi m}{M}\right) z_m(k-1) - \gamma^{2} z_m(k-2) \qquad \text{(Eq. 2)}$$

for fixed filters h_1, h_2 ... h_{M-2}, h_{M-1};

where γ is a stability parameter, x(k) is the raw telephone input signal for sampling period k, M is the number of fixed
filters h_0, h_1, h_2 ... h_{M-2}, h_{M-1}, and z_m is the filtered telephone input signal for the m-th filter h_0, h_1, h_2 ... h_{M-2}, h_{M-1}.
The stability parameter γ used in Equations 1 and 2 should be set to approximately 1, for example 0.975. The
implementation of Equations 1 and 2 in block form is shown schematically in Figs. 8B, 8C and 8D. In Fig. 8B (Equation 2), the
blocks labelled RT_1, RT_2, RT_3, RT_4 ... RT_{M-2}, and RT_{M-1} designate the recursive portions of the fixed filters h_1, h_2, h_3,
h_4 ... h_{M-2}, and h_{M-1}, respectively. Fig. 8D illustrates the implementation of RT_m for the m-th filter h_1, h_2, h_3, h_4 ... h_{M-2},
and h_{M-1}. The implementation of fixed filter h_0 in accordance with Equation 1 is shown in Fig. 8C.
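The recursive filter bank can be sketched as below. The recursion used is the sliding, exponentially weighted discrete cosine transform that follows from the finite impulse response definition of the filters (coefficients (G_m/M) γ^n cos(m(2n+1)π/(2M)), with G_0 = 1 and G_m = 2 otherwise); it is offered as an illustration rather than as a transcription of Figs. 8B to 8D, and the class and attribute names are hypothetical.

```python
import math


class RecursiveDctBank:
    """Sample-by-sample recursive DCT filter bank (illustrative sketch)."""

    def __init__(self, M, gamma=0.975):
        self.M, self.g = M, gamma
        self.x = [0.0] * (M + 2)   # x[d] holds x(k-d) for d = 0 .. M+1
        self.z1 = [0.0] * M        # z_m(k-1)
        self.z2 = [0.0] * M        # z_m(k-2)

    def step(self, sample):
        M, g = self.M, self.g
        self.x = [sample] + self.x[:-1]
        x = self.x
        z = [0.0] * M
        # Band 0: first-order recursion.
        z[0] = (x[0] - g ** M * x[M]) / M + g * self.z1[0]
        # Bands 1 .. M-1: second-order recursion.
        for m in range(1, M):
            sign = -1.0 if m % 2 else 1.0          # (-1)^m
            fir = (x[0] - g * x[1]
                   + sign * g ** (M + 1) * x[M + 1]
                   - sign * g ** M * x[M])
            z[m] = ((2.0 / M) * math.cos(math.pi * m / (2 * M)) * fir
                    + 2.0 * g * math.cos(math.pi * m / M) * self.z1[m]
                    - g * g * self.z2[m])
        self.z2, self.z1 = self.z1, z
        return z
```

Each call to `step` costs a fixed handful of multiplications per band regardless of M, which is why the recursive form is attractive on a digital signal processor; keeping γ slightly below 1 places the poles inside the unit circle.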

[0038] Alternatively, the fixed filters h_0, h_1, h_2 ... h_{M-2}, h_{M-1} may be realized by finite impulse response filters. The
preferred transform as represented by a set of finite impulse response filters is given by the following expressions:

$$z_m(k) = \sum_{n=0}^{M-1} h_m(n)\, x(k-n) = \sum_{n=0}^{M-1} \frac{G_m \gamma^{n}}{M}\cos\!\left(\frac{m(2n+1)\pi}{2M}\right) x(k-n) \qquad \text{(Eq. 3)}$$

where M is the number of fixed filters h_0, h_1, h_2 ... h_{M-2}, h_{M-1}, h_m(n) is the n-th coefficient of the m-th filter, x(k-n) is a
time-shifted version of the raw telephone input signal x(k), n = 0, 1 ... M-1, z_m(k) is the filtered telephone input signal for the
m-th filter h_0, h_1, h_2 ... h_{M-2}, h_{M-1}, γ is a stability parameter, G_m = 1 for m = 0 and G_m = 2 for m ≠ 0.
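The finite impulse response form can be sketched directly from the coefficient definition h_m(n) = (G_m/M) γ^n cos(m(2n+1)π/(2M)). The function names are hypothetical, and the caller is assumed to maintain a buffer of the M most recent input samples.

```python
import math


def fir_coefficients(M, gamma=0.975):
    """Return h[m][n] = (G_m / M) * gamma**n * cos(m*(2n+1)*pi / (2M))."""
    h = []
    for m in range(M):
        G = 1.0 if m == 0 else 2.0
        h.append([(G / M) * gamma ** n
                  * math.cos(m * (2 * n + 1) * math.pi / (2 * M))
                  for n in range(M)])
    return h


def fir_outputs(h, recent):
    """Filter outputs z_m(k), where recent[n] holds x(k-n) for n = 0 .. M-1."""
    return [sum(c * x for c, x in zip(h_m, recent)) for h_m in h]
```

This form needs M multiplications per band per sample, compared with a fixed small number for the recursive form, which is the efficiency trade-off between the two realizations.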
[0039] The preferred transforms expressed in Equations 1 through 3 can be implemented efficiently, especially in the
IIR form of Equations 1 and 2. From a theoretical standpoint, the Karhunen-Loeve transform is probably optimal in the
sense that it orthogonalizes or decouples noisy speech signals into speech and noise components most effectively.
However, the transform of Equations 1 and 2 can also be used to compute orthogonal filtered telephone input signals
z_0(k), z_1(k), z_2(k) ... z_{M-2}(k), z_{M-1}(k) for each sample period. Further, the transform filter coefficients and the filter output
are real values, therefore no complex arithmetic is introduced into the system.

[0040] The fixed filters h_0, h_1, h_2 ... h_{M-2}, h_{M-1} act as a group of band pass filters to break the raw telephone input signal
x(k) into M different frequency bands of the same bandwidth. For example, filter h_m has a band pass from about
(F_s/(2M))(m - 0.5) Hz to (F_s/(2M))(m + 0.5) Hz, resulting in a bandwidth of F_s/(2M) Hz, where F_s is the sampling frequency.
Thus, providing more fixed filters h_0, h_1, h_2 ... h_{M-2}, h_{M-1} (i.e. the greater the value is for the number M) improves the
frequency resolution of the system 82. In general, the number of fixed filters h_0, h_1, h_2 ... h_{M-2}, h_{M-1} is chosen to be as large as
possible and is limited by the amount of processing power available on the electronic controller 30 for a particular sampling
rate. For instance, if the electronic controller 30 has a digital signal processor which is a Texas Instruments
TMS320C30 DSP running at 8 kHz, the system should preferably have approximately 20-25 fixed filters h_0, h_1, h_2 ... h_{M-2},
h_{M-1}.
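As a worked example of the band layout, the sketch below computes the pass band of filter h_m, taking both band edges as multiples of F_s/(2M) so that each band has the stated width of F_s/(2M) Hz. The function name is hypothetical, and clamping the lower edge of band 0 at 0 Hz is an assumption added for the example.

```python
def band_edges_hz(m, M, fs_hz):
    """Approximate pass band (low, high) in Hz of fixed filter h_m."""
    width = fs_hz / (2.0 * M)                  # bandwidth of every band
    low = max(0.0, (m - 0.5) * width)          # clamp band 0 at 0 Hz (assumed)
    high = (m + 0.5) * width
    return (low, high)
```

With M = 20 filters at an 8 kHz sampling rate, each band is 200 Hz wide and filter h_5 spans roughly 900 Hz to 1100 Hz.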

[0041] Each of the filtered telephone input signals z_0(k), z_1(k), z_2(k) ... z_{M-2}(k), z_{M-1}(k) is weighted by a respective
time-varying filter gain element β_0(k), β_1(k), β_2(k) ... β_{M-2}(k), β_{M-1}(k). Each of the time-varying filter gain elements β_0(k),
β_1(k), β_2(k) ... β_{M-2}(k), β_{M-1}(k) is preferably determined in accordance with the following expression:



β_m(k) = 1 - … / (SSL_m(k) + α)   (Eq. 4)



where β_m(k) is the value of the time-varying filter gain element associated with the m-th fixed filter h_0, h_1, h_2 ... h_{M-2}, h_{M-1}
at sampling period k, SSL_m(k) is the speech strength level for the respective filtered telephone input signal z_0(k), z_1(k),
z_2(k) ... z_{M-2}(k), z_{M-1}(k) at sampling period k, and μ and α are preselected performance parameters having values
greater than 0. It has been found that selecting μ equal to approximately 4, and α equal to approximately 2 provides
adequate noise reduction while retaining natural sounding processed speech. If the noise power for a frequency band
is excessive, it can be useful in some applications to set the corresponding time-varying gain element β_m(k) = 0. The
time-varying filter gain elements β_0(k), β_1(k), β_2(k) ... β_{M-2}(k), β_{M-1}(k) each output a respective weighted and filtered
telephone input signal in lines 158A, 158B, 158C, 158D, and 158E, respectively. The weighted and filtered telephone input
signals are combined in summer 160 which outputs the noise-reduced telephone input signal Tx_out(k) in line 118. The
noise-reducing filtering technique shown in Fig. 8 is particularly useful because it is implemented on a sample-by-sample
basis, and does not require an explicit inverse transform. Noise reduction filtering is accomplished on-line in real
time.

[0042] The speech strength level SSL_m(k) for the respective filtered telephone input signal z_0(k), z_1(k), z_2(k) ... z_{M-2}(k),
z_{M-1}(k) at sample period k is determined in accordance with the following expression:

SSL_m(k) = s_pwr_m(k) / n_pwr_m(k)   (Eq. 5)

where s_pwr_m(k) is an estimate of combined speech and noise power in the m-th filtered telephone input signal z_0(k),
z_1(k), z_2(k) ... z_{M-2}(k), z_{M-1}(k) at sample period k and n_pwr_m(k) is an estimate of noise power in the m-th filtered
telephone input signal at sample period k. It is preferred that the combined speech and noise power level s_pwr_m(k) for the
respective filtered telephone input signal z_0(k), z_1(k), z_2(k) ... z_{M-2}(k), z_{M-1}(k) at sample period k be estimated in
accordance with the following expression:

s_pwr_m(k) = s_pwr_m(k-1) + λ_m (z_m(k) * z_m(k) - s_pwr_m(k-1))   (Eq. 6)

where λ_m is a fixed time constant that is in general different for each of the M fixed filters h_0, h_1, h_2 ... h_{M-2}, h_{M-1}, and
z_m(k) is the value of the respective filtered telephone input signal z_0(k), z_1(k), z_2(k) ... z_{M-2}(k), z_{M-1}(k) at sample period k taken
when speech is present in the raw telephone input signal x(k), or in other words, when the input line is in the "on" state.
The time constants λ_m are determined so that the effective length of the averaging window used to estimate the power
in a particular frequency band is inversely proportional to the center frequency of the frequency band. In other words, the time
constant λ_m increases to yield a faster estimation of speech and noise power level as the center frequency of the band
increases. This ensures a fast overall dynamic system response. The time constants λ_m are preferably less than 0.10
and greater than 0.01.

[0043] The noise power level estimate n_pwr_m(k) for the filtered telephone input signals z_0(k), z_1(k), z_2(k) ... z_{M-2}(k),
z_{M-1}(k) used for sample period k is preferably estimated in accordance with the following expression:

n_pwr_m(k) = n_pwr_m(k-1) + λ_0 (z_m(k) * z_m(k) - n_pwr_m(k-1))   (Eq. 7)

where z_m(k) is the value of the respective filtered telephone input signal z_0(k), z_1(k), z_2(k) ... z_{M-2}(k), z_{M-1}(k) at sample
period k taken when speech is not present in the raw telephone input signal x(k), and λ_0 is a fixed time constant
preferably set to a small value, such as λ_0 equal to approximately 10^-3. Setting fixed time constant λ_0 to a small value provides
a long averaging window for estimating the noise power level n_pwr_m(k).
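The power estimates and the speech strength level for one frequency band can be sketched as a small tracker. The class name, the `speech_present` flag supplied by the caller, and the small floor that guards against division by zero before any noise has been observed are assumptions added for the example.

```python
class BandPowerTracker:
    """Tracks s_pwr, n_pwr and their ratio SSL for one band (sketch)."""

    def __init__(self, lam_m, lam_0=1e-3, floor=1e-12):
        self.lam_m = lam_m      # speech-plus-noise time constant (0.01 .. 0.10)
        self.lam_0 = lam_0      # noise time constant, small for a long window
        self.floor = floor      # hypothetical guard against dividing by zero
        self.s_pwr = 0.0
        self.n_pwr = 0.0

    def update(self, z, speech_present):
        power = z * z
        if speech_present:
            # Eq. 6: updated only while speech is present.
            self.s_pwr += self.lam_m * (power - self.s_pwr)
        else:
            # Eq. 7: updated only while speech is absent.
            self.n_pwr += self.lam_0 * (power - self.n_pwr)
        return self.ssl()

    def ssl(self):
        # Eq. 5: ratio of speech-plus-noise power to noise power.
        return self.s_pwr / max(self.n_pwr, self.floor)
```

Feeding the tracker a unit-amplitude band signal while idle and then an amplitude of 3 while speech is present drives SSL toward 9, the ratio of the two powers.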

[0044] The noise reduction filter 82 generally has two modes of operation, a noise estimation mode and a speech
filtering mode. In the noise estimation mode, background noise for each band corresponding to the fixed filters h_0, h_1,
h_2 ... h_{M-2}, h_{M-1} is estimated. In order to track changes in noise conditions within the vehicle 15, the noise reduction filter
82 periodically returns to the noise estimation mode when speech is not present in the raw telephone input signal x(k)
(i.e. when the microphone steering switch 80 is switched to the idle state 120, Fig. 4). In practice, it is desirable to
estimate only the stationary background noise present on the microphone signals (i.e., background noise which statistically
does not vary substantially over time). This is accomplished by setting a time constant λ_0 equal to a small value, such
as λ_0 equal to approximately 10^-3.

[0045] When speech is present in the raw telephone input signal x(k), the system operates in the speech filtering
mode. After estimating the combined speech and noise power level s_pwr_m(k) at the sample period k for each of the
filtered telephone input signals z_0(k), z_1(k), z_2(k) ... z_{M-2}(k), z_{M-1}(k), the respective time-varying filter gain elements β_0(k),
β_1(k), β_2(k) ... β_{M-2}(k), β_{M-1}(k) are adjusted between 0 and 1 according to the signal-to-noise power ratio SSL_m(k)
corresponding to each filtered telephone input signal z_0(k), z_1(k), z_2(k) ... z_{M-2}(k), z_{M-1}(k), Eq. 4. For example, if the speech
strength level is large in a particular band, the corresponding gain element will be approximately one, thus passing the
speech on this band. If the SSL is small, the corresponding gain element will be approximately zero, thus removing the
noise in this band. As mentioned above, it may be useful to set β_m(k) = 0 when n_pwr_m(k) is greater than a preselected
threshold value. In this manner, the time-varying filter gain elements β_0(k), β_1(k), β_2(k) ... β_{M-2}(k), β_{M-1}(k) track the
characteristics of speech present within the raw telephone input signal x(k) and thereby create a more intelligible
noise-reduced telephone input signal Tx_out(k).
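The final weighting and summation step can be sketched as follows. The gains are taken as an input, since how they are derived from the speech strength level is governed by Eq. 4; the function name, the clamping of each gain to the range 0 to 1, and the optional noise-power limit argument are assumptions for the example.

```python
def noise_reduced_sample(z, beta, n_pwr, n_pwr_limit=None):
    """Weight each band output by its gain and sum the results.

    z:           filtered band signals z_m(k) for the current sample.
    beta:        time-varying gains, one per band.
    n_pwr:       noise power estimates, one per band.
    n_pwr_limit: optional threshold above which a band is muted entirely.
    """
    out = 0.0
    for z_m, b_m, n_m in zip(z, beta, n_pwr):
        b_m = min(max(b_m, 0.0), 1.0)        # keep the gain between 0 and 1
        if n_pwr_limit is not None and n_m > n_pwr_limit:
            b_m = 0.0                        # band judged too noisy to pass
        out += b_m * z_m
    return out
```

Because the output is simply a weighted sum of the band signals, no explicit inverse transform is needed and the result is available for every input sample.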

[0046] Fig. 9A schematically illustrates the MIMO integrated vehicle voice enhancement system and hands-free
cellular telephone system 10 illustrated in Fig. 1. In many respects, the MIMO system 10 shown in Fig. 9 is similar to the
SISO system 78 shown in Fig. 3, and like reference numerals will be used where helpful to facilitate understanding of
the invention.

[0047] In Fig. 9A, the first near-end microphone 20 senses speech and noise present at speaking location 16 and
generates a first near-end voice signal that is transmitted through line 28 to a first near-end echo cancellation summer
162A. The first near-end echo cancellation summer 162A also inputs a first near-end echo cancellation signal from line
164A and a third near-end echo cancellation signal from line 164C. The first near-end echo cancellation signal in line
164A is generated by a first near-end adaptive acoustic echo canceller AEC_{11,11}. The first near-end adaptive echo
canceller AEC_{11,11} (as well as the other adaptive echo cancellers in Fig. 9, AEC_{11,12}, AEC_{12,11}, AEC_{12,12}, AEC_{21,21},
AEC_{21,22}, AEC_{22,21}, and AEC_{22,22}) is preferably an adaptive FIR filter as discussed with respect to Fig. 3, and inputs a
first near-end input signal in line 54 that drives the first near-end loudspeaker 24. The third adaptive echo canceller
AEC_{12,11} inputs a second near-end input signal in line 52 that drives the second near-end loudspeaker 26, and outputs
the third near-end echo cancellation signal in line 164C. The first near-end echo cancellation summer 162A subtracts
the first near-end echo cancellation signal in line 164A and the third near-end echo cancellation signal in line 164C from
the first near-end voice signal in line 28 to generate a first echo-cancelled, near-end voice signal in line 166A. The first
adaptive acoustic echo canceller AEC_{11,11} adaptively models the path between the first near-end loudspeaker 24 and
the output of the first near-end microphone 20. The third adaptive acoustic echo canceller AEC_{12,11} adaptively models
the path between the second near-end loudspeaker 26 and the output from the first near-end microphone 20. Thus, the
first near-end echo cancellation summer 162A subtracts from the first near-end voice signal in line 28 that portion of the
signal due to sound introduced by the first near-end loudspeaker 24, and also that portion of the signal due to sound
introduced by the second near-end loudspeaker 26. The first echo-cancelled, near-end voice signal in line 166A is
transmitted to both a far-end voice enhancement steering switch 168A and also to a telephone steering switch 80A through
line 170A.
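The multi-loudspeaker echo cancellation summer can be sketched as one microphone sample minus one modelled echo per loudspeaker in the same zone. The function names and the fixed (already adapted) coefficients are assumptions for the example; in the described system each canceller would be adapting continuously.

```python
def fir(weights, history):
    """Echo estimate from one canceller: FIR filter over loudspeaker history."""
    return sum(w * u for w, u in zip(weights, history))


def echo_cancelled_mic(mic_sample, canceller_weights, loudspeaker_histories):
    """Subtract the modelled echo of every loudspeaker from one microphone.

    canceller_weights:     one coefficient list per loudspeaker-to-microphone path.
    loudspeaker_histories: matching lists of recent loudspeaker samples,
                           newest first.
    """
    echo = sum(fir(w, u)
               for w, u in zip(canceller_weights, loudspeaker_histories))
    return mic_sample - echo
```

If a microphone picks up speech of 1.0 plus echoes of 0.5 × 2.0 and 0.25 × 4.0 from two loudspeakers, subtracting both modelled echoes recovers the 1.0 contributed by speech.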

[0048] The second near-end microphone 22 senses speech and noise present at speaking location 18 and outputs
a second near-end voice signal through line 32 to a second near-end echo cancellation summer 162B. The second
near-end echo cancellation summer 162B also receives a second near-end echo cancellation signal in line 164B and
a fourth near-end echo cancellation signal in line 164D. The second near-end echo cancellation signal in line 164B is
generated by a second near-end adaptive acoustic echo canceller AEC_{12,12}. The second near-end adaptive acoustic echo
canceller AEC_{12,12} inputs the second near-end input signal in line 52 which drives the second near-end loudspeaker
26. The fourth near-end echo cancellation signal in line 164D is generated by a fourth near-end adaptive acoustic echo
canceller AEC_{11,12}. The fourth near-end adaptive acoustic echo canceller AEC_{11,12} inputs the first near-end input
signal in line 54 that drives the first near-end loudspeaker 24. The second near-end echo cancellation summer 162B
subtracts the second near-end echo cancellation signal in line 164B and the fourth near-end echo cancellation signal in line
164D from the second near-end voice signal in line 32 to generate a second echo-cancelled, near-end voice signal in
line 166B. The second near-end adaptive acoustic echo canceller AEC_{12,12} adaptively models the path between the
second near-end loudspeaker 26 and the output of the second near-end microphone 22. The fourth near-end adaptive
acoustic echo canceller AEC_{11,12} adaptively models the path between the first near-end loudspeaker 24 and the output
of the second near-end microphone 22. Thus, the second near-end echo cancellation summer 162B subtracts from the
second near-end voice signal in line 32 that portion of the signal due to sound introduced by the second near-end
loudspeaker 26, and also that portion of the signal due to sound introduced by the first near-end loudspeaker 24. The
second echo-cancelled, near-end voice signal in line 166B is transmitted to both the far-end voice enhancement steering
switch 168A, and to the telephone steering switch 80A through line 170B.

[0049] The first far-end microphone 38 senses speech and noise present at speaking location 34 within the far-end
zone 14 and generates a first far-end voice signal that is transmitted through line 46 to a first far-end echo cancellation
summer 172A. The first far-end echo cancellation summer 172A also inputs a first far-end echo cancellation signal from
line 174A and a third far-end echo cancellation signal from line 174C. The first far-end echo cancellation signal in line
174A is generated by a first far-end adaptive acoustic echo canceller AEC_{21,21}. The first far-end adaptive acoustic echo
canceller AEC_{21,21} inputs a first far-end input signal in line 54 that drives the first far-end loudspeaker 42. The third
far-end echo cancellation signal in line 174C is generated by the third far-end adaptive acoustic echo canceller AEC_{22,21}.
The third far-end adaptive echo canceller AEC_{22,21} inputs a second far-end input signal in line 56 that also drives the
second far-end loudspeaker 44. The first far-end adaptive acoustic echo canceller AEC_{21,21} models the path between the
first far-end loudspeaker 42 and the output of the first far-end microphone 38. The third far-end adaptive acoustic echo
canceller AEC_{22,21} models the path between the second far-end loudspeaker 44 and the output of the first far-end
microphone 38. The first far-end echo cancellation summer 172A subtracts the first far-end echo cancellation signal in
line 174A and the third far-end echo cancellation signal in line 174C from the first far-end voice signal in line 46 to
generate a first echo-cancelled, far-end voice signal in line 176A. The first echo-cancelled, far-end voice signal in line 176A
is transmitted both to a near-end voice enhancement steering switch 168B, and also to the telephone steering switch
80A through line 170C.

[0050] The second far-end microphone 40 senses speech and noise present at speaking location 36 in the far-end
zone 14 and generates a second far-end voice signal that is transmitted to a second far-end echo cancellation summer
172B through line 48. A second far-end echo cancellation signal in line 174B and a fourth far-end echo cancellation
signal in line 174D are also input to the second far-end echo cancellation summer 172B. The second far-end echo
cancellation signal in line 174B is generated by a second far-end adaptive acoustic echo canceller AEC 22,22. The second
far-end adaptive acoustic echo canceller AEC 22,22 inputs the second far-end input signal in line 56, which also drives
the second far-end loudspeaker 44. The second far-end adaptive acoustic echo canceller AEC 22,22 models the path between
the second far-end loudspeaker 44 and the output of the second far-end microphone 40. The fourth far-end echo
cancellation signal in line 174D is generated by a fourth far-end adaptive acoustic echo canceller AEC 21,22. The fourth
far-end adaptive acoustic echo canceller AEC 21,22 inputs the first far-end input signal in line 54 that drives the first
far-end loudspeaker 42. The fourth far-end adaptive acoustic echo canceller AEC 21,22 models the path between the first
far-end loudspeaker 42 and the output of the second far-end microphone 40. The second far-end echo cancellation summer
172B subtracts the second far-end echo cancellation signal in line 174B and the fourth far-end echo cancellation signal
in line 174D from the second far-end voice signal in line 48 to generate a second echo-cancelled, far-end voice signal
in line 176B. The second echo-cancelled, far-end voice signal in line 176B is transmitted to both the near-end voice
enhancement steering switch 168B and the telephone steering switch 80A through line 170D.
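
The signal flow of paragraphs [0049] and [0050] (two adaptive echo cancellers feeding one echo cancellation summer per microphone) can be sketched as follows. This is only an illustration under stated assumptions: the passage does not name the adaptation algorithm, so a normalized LMS update is assumed here, and the names `EchoCanceller`, `summer` and `cancel` are hypothetical.

```python
import numpy as np

class EchoCanceller:
    """Adaptive FIR model of one loudspeaker-to-microphone path (NLMS is an assumption)."""

    def __init__(self, taps=8, step=0.25, eps=1e-8):
        self.w = np.zeros(taps)   # current estimate of the path impulse response
        self.x = np.zeros(taps)   # most recent loudspeaker input samples
        self.step = step
        self.eps = eps

    def estimate(self, speaker_sample):
        # Shift in the newest loudspeaker sample and predict the echo it causes.
        self.x = np.roll(self.x, 1)
        self.x[0] = speaker_sample
        return float(self.w @ self.x)

    def adapt(self, error):
        # Normalized LMS update driven by the echo-cancelled output.
        self.w += self.step * error * self.x / (self.x @ self.x + self.eps)

def summer(mic_sample, echo_estimates):
    """Echo cancellation summer: subtract every echo estimate from the microphone signal."""
    return mic_sample - sum(echo_estimates)

def cancel(mic, spk_a, spk_b, taps=8):
    """Echo-cancel one microphone signal against two loudspeaker input signals."""
    aec_a, aec_b = EchoCanceller(taps), EchoCanceller(taps)
    out = np.zeros(len(mic))
    for k in range(len(mic)):
        e_a = aec_a.estimate(spk_a[k])
        e_b = aec_b.estimate(spk_b[k])
        out[k] = summer(mic[k], [e_a, e_b])
        aec_a.adapt(out[k])
        aec_b.adapt(out[k])
    return out
```

For the first far-end microphone 38, `spk_a` and `spk_b` would correspond to the input signals in lines 54 and 56, and the returned signal to the echo-cancelled voice signal in line 176A.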

[0051] The telephone steering switch 80A outputs a raw telephone input signal in line 116, preferably in accordance
with the state diagram shown in Fig. 10. The raw telephone input signal in line 116 inputs the noise reduction filter 82,
which is preferably the same as the filter shown in Fig. 8. The noise reduction filter 82 outputs a noise-reduced
telephone input signal Tx_out(k) to the cellular telephone 58. The cellular telephone 58 outputs a telephone receive
signal Rx_in in line 178 that is eventually transmitted to the loudspeakers 24, 26, 42, and 44 in the system 10.

[0052] Fig. 9A shows the telephone receive signal Rx_in inputting block 168A, 168B, which schematically illustrates
both the far-end voice enhancement steering switch 168A and the near-end voice enhancement steering switch 168B.
The far-end voice enhancement steering switch 168A operates generally in the same manner as the steering switch 80
shown in Fig. 3 and described in conjunction with Figs. 4 and 7; however, microphone output in the "off" state for the
far-end voice enhancement steering switch 168A is preferably set to 10% or less, rather than approximately 20%. The
far-end voice enhancement steering switch 168A thus selects and mixes the first and second echo-cancelled, near-end
voice signals in lines 166A and 166B and generates a far-end voice enhancement input signal in line 180A. One purpose
of the near-end voice enhancement steering switch 168B and of the far-end voice enhancement steering switch 168A is
to reduce and/or eliminate microphone falsing within the respective acoustic zones 12, 14. For instance, both of the
near-end microphones 20 and 22 are likely to sense speech from a single passenger and/or driver located in the near-end
acoustic zone 12, especially if the driver and/or passenger is not located in close proximity to one of the
microphones 20, 22 or the driver and/or passenger is speaking loudly (i.e., both of the near-end microphones 20, 22
are acoustically coupled to one another).

[0053] Fig. 9A shows the far-end voice enhancement input signal in line 180A being transmitted through line 182A to
a first far-end audio summer 184A and also through line 182B to a second far-end audio summer 184B. Block 186A
illustrates the generation of a first far-end audio signal that is summed in summer 184A with the far-end voice
enhancement input signal in line 182A to generate the first far-end input signal in line 54 that drives the first far-end
loudspeaker 42. Block 186B illustrates the generation of a second far-end audio signal that is summed in summer 184B
with the far-end voice enhancement input signal in line 182B to generate the second far-end input signal in line 56 that
drives the second far-end loudspeaker 44.

[0054] The near-end voice enhancement steering switch 168B operates generally in the same manner as the far-end
voice enhancement steering switch 168A. The near-end voice enhancement steering switch 168B selects and mixes
the first and second echo-cancelled, far-end voice signals in lines 176A and 176B and generates a near-end voice
enhancement input signal in line 180B. The near-end voice enhancement input signal in line 180B is transmitted through
line 188A to a first near-end audio summer 190A and through line 188B to a second near-end audio summer 190B. Block
192A illustrates the generation of a first near-end audio signal that is summed in summer 190A with the near-end voice
enhancement input signal in line 188A to generate the first near-end input signal in line 50 that drives the first near-end
loudspeaker 24. Block 192B illustrates the generation of a second near-end audio signal that is summed in summer
190B with the near-end voice enhancement input signal in line 188B to generate the second near-end input signal in
line 52 that drives the second near-end loudspeaker 26.

[0055] When the telephone receive signal Rx_in is present in line 178, it is preferred that block 168A, 168B transmit the
telephone receive signal Rx_in in both lines 180A and 180B, rather than a form of echo-cancelled voice signals from the
respective microphones 20, 22, 38 and 40. In addition, it is desirable that audio input illustrated by blocks 186A, 186B,
192A, 192B be suspended while the cellular telephone 58 is in operation.

[0056] The MIMO system 10A shown in Fig. 9B is similar in many respects to the MIMO system 10 shown in Fig. 9A,
except the noise reduction filter 82 shown in Fig. 9A has been replaced by a plurality of noise reduction filters 182A,
182B, 182C, and 182D. In Fig. 9B, the noise reduction filters 182A, 182B, 182C, 182D are placed in the echo-cancelled
near-end voice signal lines 166A, 166B and the echo-cancelled far-end voice signal lines 176A and 176B, respectively.
In addition to improving the clarity of the telephone input signal Tx_out, this implementation also removes the
background noise in the voice signals themselves. Noise reduction filter 182A removes the background noise in the first
echo-cancelled near-end voice signal line 166A, noise reduction filter 182D removes the background noise in the second
echo-cancelled near-end voice signal line 166B, noise reduction filter 182B removes the background noise in the
first echo-cancelled far-end voice line 176A, and noise reduction filter 182C removes the background noise in the second
echo-cancelled far-end voice line 176B, thereby preventing the rebroadcasting of noise on the pair of near-end
loudspeakers 24, 26 and the pair of far-end loudspeakers 42, 44, respectively. In other respects, the MIMO system 10A
shown in Fig. 9B is similar to the MIMO system 10 shown in Fig. 9A.

[0057] Fig. 10 is a state diagram illustrating the operation of the telephone steering switch 80A in Figs. 9A and 9B.
The idle state 194 indicates that none of the microphones 20, 22, 38, 40 is generating a voice signal having a sound
level exceeding the threshold switching value 66, Fig. 2A. In Fig. 10, state 196 indicates that the first near-end
microphone 20, labelled as MIC 11, is the designated primary microphone. State 198 indicates that the second near-end
microphone 22, labelled as MIC 12, is the designated primary microphone. State 200 indicates that the first far-end
microphone 38, labelled as MIC 21, is the designated primary microphone. State 202 indicates that the second far-end
microphone 40, labelled as MIC 22, is the designated primary microphone. Lines 196A, 198A, 200A, and 202A illustrate
that when the system is in the idle state 194, the system designates the first microphone to have a voice signal with a
sound level exceeding the threshold switching value 66, Fig. 2A, as the designated primary microphone. Lines 196B,
198B, 200B and 202B indicate that the designated primary microphone will enter the fade-out state 204 after expiration
of a holding time period t_H, and fade out from the "on" state to the "off" state, as long as no other microphone is
requesting priority to be the designated primary microphone. Line 206 from the fade-out state 204 to the idle state 194
indicates that the system enters the idle state 194 once the fade-out state 204 is completed. The cross-fade state 208
illustrates that the designated primary microphone cross-fades from the "on" state to the "off" state when one of the
other microphones gains priority to become the designated primary microphone. It is desirable that the three
microphones which are not designated as the primary microphone compete among each other to determine which of them
may request priority to become the designated primary microphone. Such a competition can occur in various ways, but
preferably the microphone signal having the highest sound level, determined via round-robin, is designated as the
priority-requesting microphone. Otherwise, cross-fading is preferably implemented in accordance with the cross-fading
described in Figs. 6A and 6B.
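
The state diagram of Fig. 10 can be illustrated with a small state machine. The sketch below rests on several assumptions that are not taken from the text: a rival microphone is treated as gaining priority when its level exceeds both the threshold and the current primary's level, the holding time t_H is counted in update steps, and the fade-out and cross-fade each complete in a single step. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

IDLE, ON, FADE_OUT, CROSS_FADE = "idle", "on", "fade_out", "cross_fade"

@dataclass
class SteeringSwitch:
    threshold: float               # threshold switching value (66 in Fig. 2A)
    hold_steps: int                # holding time t_H in update steps (assumed unit)
    state: str = IDLE
    primary: Optional[int] = None  # index of the designated primary microphone
    quiet_steps: int = 0

    def step(self, levels: Sequence[float]) -> str:
        """Advance one update step given the current sound level of each microphone."""
        loud = [i for i, v in enumerate(levels) if v > self.threshold]
        if self.state == IDLE:
            if loud:
                # First microphone over the threshold becomes primary
                # (ties within one step go to the loudest, an assumption).
                self.primary = max(loud, key=lambda i: levels[i])
                self.state, self.quiet_steps = ON, 0
        elif self.state == ON:
            rivals = [i for i in loud
                      if i != self.primary and levels[i] > levels[self.primary]]
            if rivals:
                # The other microphones compete; the loudest one requests priority.
                self.primary = max(rivals, key=lambda i: levels[i])
                self.state, self.quiet_steps = CROSS_FADE, 0
            elif self.primary in loud:
                self.quiet_steps = 0
            else:
                self.quiet_steps += 1
                if self.quiet_steps >= self.hold_steps:
                    self.state = FADE_OUT
        elif self.state == CROSS_FADE:
            self.state = ON                          # cross-fade completed
        elif self.state == FADE_OUT:
            self.state, self.primary = IDLE, None    # fade-out completed (line 206)
        return self.state
```

With four microphones, index 0 to 3 would stand for MIC 11, MIC 12, MIC 21 and MIC 22.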

[0058] As with the SISO systems in Figs. 3A and 3B, it is desirable that the raw telephone input signal in line 116 be a
combination of 100% of the designated primary microphone signal and approximately 20% of the microphone signals
of microphones in the "off" state. In some vehicles, it may be desirable to lower the percentage of microphone signal
transmitted from microphones in the "off" state. In any event, the MIMO system shown in Figs. 9A, 9B and 10 has more
microphones than the SISO systems shown in Figs. 3A and 3B, and therefore noise reduction filtering, block 82 in Fig.
9A and blocks 182A, 182B, 182C, 182D in Fig. 9B, is extremely desirable so that an intelligible, noise-reduced
telephone input signal Tx_out is transmitted to the cellular telephone 58. In addition, the system 10 shown in Fig. 9A
and the system 10A shown in Fig. 9B can also include privacy switches (not shown) similar to privacy switches 110 and
112 shown in the system 78 in Figs. 3A and 3B.
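
As a brief illustration of the mixing rule, the raw output of a steering switch can be formed as a weighted sum of the microphone signals. The function name `mix` and its arguments are hypothetical, fades are ignored, and applying the "off" level to every microphone in the idle state is an assumption.

```python
import numpy as np

def mix(signals, primary, off_gain=0.2):
    """Combine microphone signals: 100% of the primary plus off_gain of the others.

    signals: array of shape (num_mics, num_samples)
    primary: index of the designated primary microphone, or None when idle
    """
    signals = np.asarray(signals, dtype=float)
    gains = np.full(signals.shape[0], off_gain)   # microphones in the "off" state
    if primary is not None:
        gains[primary] = 1.0                      # microphone in the "on" state
    return gains @ signals
```

For the telephone steering switch `off_gain` would be about 0.2; a voice enhancement steering switch would use a smaller value, 0.1 or less.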

[0059] Fig. 11 is a state diagram showing the operation of the far-end voice enhancement steering switch 168A and
the near-end voice enhancement steering switch 168B. In Fig. 11, as in Fig. 10, the first near-end microphone 20 is
labelled MIC 11, the second near-end microphone 22 is labelled MIC 12, the first far-end microphone 38 is labelled
MIC 21, and the second far-end microphone 40 is labelled MIC 22. In general, the far-end voice enhancement steering
switch 168A designates either the first near-end microphone 20 labelled MIC 11 or the second near-end microphone 22
labelled MIC 12 as a primary near-end microphone. If neither of the near-end microphones MIC 11 or MIC 12 has a sound
level exceeding the threshold switching value 66, Fig. 2A, the far-end voice enhancement steering switch 168A resides
in the idle state 210. If the steering switch 168A is in the idle state and either of the near-end microphones MIC 11 or
MIC 12 has a sound level exceeding the threshold switching value 66, Fig. 2A, the steering switch 168A switches to the
respective state 212 or 214 as indicated by lines 212A and 214A. The far-end voice enhancement input signal in line
180A is a combination of the microphone signals from MIC 11 and MIC 12, with the designated primary microphone having
100% of the microphone output combined with approximately 1%-10% of the microphone output of the other near-end
microphone. Note that the percentage of transmission of the microphone output signal from the microphone not
designated as the primary microphone is preferably less than the same with respect to the telephone steering switch, for
example 80A in Figs. 9A and 9B. With the telephone steering switch 80A, it is desirable that the raw telephone input
signal have a substantial sound level, especially when speech is not present, so that the line does not appear dead to a
listener on the other end of the telephone line. In contrast, it is not necessary or even desirable for the far-end voice
enhancement input signal in line 180A to have a detectable amount of background noise present within the signal, even
when speech is not present. Therefore, only a small percentage, preferably undetectable by a driver and/or passenger
within the vehicle, is transmitted as part of the far-end voice enhancement input signal in line 180A. It is desirable,
however, that a small percentage of the microphone output be transmitted so that microphones in the "off" state do not
click on and off, which would be annoying to the driver and/or passengers within the vehicle. The far-end voice
enhancement steering switch 168A also includes a fade-out state 216 and a cross-fade state 218 which operate
substantially as described with respect to Figs. 4-7.

[0060] The near-end voice enhancement steering switch 168B preferably operates in a similar manner to the far-end
voice enhancement steering switch 168A. The near-end voice enhancement steering switch 168B includes an idle state
220 in which the microphone output from both the first far-end microphone 38 labelled as MIC 21 and the second far-end
microphone 40 labelled as MIC 22 has a sound level below the threshold switching value 66, Fig. 2A. State 222
labelled MIC 21 indicates a state in which the first far-end microphone 38 is designated as the primary microphone. State
224 labelled MIC 22 represents the state in which the second far-end microphone 40 is designated as the primary
microphone. The near-end voice enhancement steering switch 168B also includes a fade-out state 226 and a cross-fade
state 228 which operate in a similar manner as described with respect to the far-end voice enhancement steering switch
168A and the telephone steering switch 80 described in Figs. 4-7. As with the far-end voice enhancement steering
switch 168A, the near-end voice enhancement steering switch 168B outputs the near-end voice enhancement input
signal in line 180B, which is a combination of 100% of the designated primary microphone 38 or 40 and preferably
1%-10% of the other microphone 40 or 38, respectively.

[0061] The invention has been described in accordance with a preferred embodiment of carrying out the invention;
however, the scope of the following claims should not be limited thereto. Various modifications, alternatives or
equivalents may be apparent to those skilled in the art, and the following claims should be interpreted to cover such
modifications, alternatives and equivalents.


Claims 

1. An integrated vehicle voice enhancement system and hands-free cellular telephone system comprising:
a near-end acoustic zone;
a far-end acoustic zone; 

a near-end microphone that senses sound in the near-end zone and generates a near-end voice signal; 
a far-end microphone that senses sound in the far-end zone and generates a far-end voice signal; 
a near-end loudspeaker that inputs a near-end input signal and outputs sound into the near-end zone;
a far-end loudspeaker that inputs a far-end input signal and outputs sound into the far-end zone; 
a near-end adaptive acoustic echo canceler that receives the near-end input signal and generates a near-end 
echo cancellation signal; 

a near-end echo cancellation summer that inputs the near-end voice signal and the near-end echo cancellation 
signal and outputs an echo-cancelled, near-end voice signal; 

a far-end adaptive acoustic echo canceler that receives the far-end input signal and generates a far-end echo 
cancellation signal; 

a far-end echo cancellation summer that inputs the far-end voice signal and the far-end echo cancellation sig- 
nal and outputs an echo-cancelled, far-end voice signal; 

a microphone steering switch that inputs the echo-cancelled, near-end voice signal and the echo-cancelled, 
far-end voice signal and outputs a telephone input signal; and 
a cellular telephone that inputs the telephone input signal; 

wherein at least one noise reduction filter is used to improve the clarity of the telephone input signal inputting 
the cellular telephone. 

2. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 1
wherein the noise reduction filter comprises:

a plurality of fixed filters, each fixed filter inputting the raw telephone input signal and outputting a respective
filtered telephone input signal;

a time-varying filter gain element corresponding to each fixed filter that inputs the respective filtered telephone
input signal and outputs a weighted and filtered telephone input signal; and

a summer that inputs the weighted and filtered telephone input signals and outputs a noise-reduced telephone
input signal.

3. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 1
comprising: 

a first noise reduction filter that inputs the raw echo-cancelled, near-end voice signal and outputs a noise- 
reduced, echo-cancelled, near-end voice signal; and 

a second noise reduction filter that inputs the raw echo-cancelled, far-end voice signal and outputs a noise- 
reduced, echo-cancelled, far-end voice signal. 

4. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 1
wherein the noise reduction filter is a recursive implementation of a discrete cosine transform modified to stabilize 
its performance in a digital signal processor. 

5. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 4
wherein each of the plurality of fixed filters is a finite impulse response filter. 

6. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 5
wherein the finite impulse response filters are represented by the following expression:

$$z_m(k) = \frac{G_m}{M} \sum_{n=0}^{M-1} \gamma^{n} \cos\left(\frac{\pi m (2n+1)}{2M}\right) x(k-n)$$

where M is the number of fixed filters, x(k-n) is a time-shifted version of the raw input signal, n=0,1...M-1, z_m(k) is
the filtered input signal for the m-th filter, m=0,1...M-1, γ is a stability factor, and G_m=1 for m=0, and G_m=2 for m≠0.

7. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 4
wherein the plurality of fixed filters are infinite impulse response filters.

8. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 7
wherein the infinite impulse response filters are represented by the following expressions:

$$z_0(k) = \frac{1}{M}\left[x(k) - \gamma^{M} x(k-M)\right] + \gamma\, z_0(k-1)$$

for fixed filter m=0, and

$$z_m(k) = \frac{2}{M}\cos\left(\frac{\pi m}{2M}\right)\left[x(k) - \gamma\, x(k-1) + (-1)^{m}\gamma^{M+1} x(k-[M+1]) - (-1)^{m}\gamma^{M} x(k-M)\right] + 2\gamma\cos\left(\frac{\pi m}{M}\right) z_m(k-1) - \gamma^{2} z_m(k-2)$$

for fixed filter m=1,2...M-1,

where γ is a stability parameter, x(k) is the raw input signal for sampling period k, M is the number of fixed filters,
and z_m(k) is the filtered input signal for the m-th filter, m=0,1...M-1.
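
As an illustration of the recursive (infinite impulse response) form, the sketch below implements it next to a direct finite impulse response bank of length M with taps (G_m/M) γ^n cos(πm(2n+1)/(2M)), G_0=1 and G_m=2 otherwise, so the two can be compared numerically. The FIR tap formula is the editor's reading of the damaged expressions and should be treated as an assumption, as should the zero initial conditions; the function names are hypothetical.

```python
import numpy as np

def fir_bank(x, M, gamma):
    """Direct form: z_m(k) = (G_m/M) * sum_n gamma^n cos(pi*m*(2n+1)/(2M)) x(k-n)."""
    n = np.arange(M)
    z = np.zeros((M, len(x)))
    for m in range(M):
        g = 1.0 if m == 0 else 2.0
        h = (g / M) * gamma ** n * np.cos(np.pi * m * (2 * n + 1) / (2 * M))
        z[m] = np.convolve(x, h)[:len(x)]
    return z

def iir_bank(x, M, gamma):
    """Recursive form of the same bank, with zero initial conditions."""
    z = np.zeros((M, len(x)))

    def past(sig, k):
        # Samples before the start of the record are taken as zero.
        return sig[k] if k >= 0 else 0.0

    for k in range(len(x)):
        z[0, k] = (x[k] - gamma ** M * past(x, k - M)) / M + gamma * past(z[0], k - 1)
        for m in range(1, M):
            sign = (-1) ** m
            drive = (x[k] - gamma * past(x, k - 1)
                     + sign * gamma ** (M + 1) * past(x, k - M - 1)
                     - sign * gamma ** M * past(x, k - M))
            z[m, k] = ((2 / M) * np.cos(np.pi * m / (2 * M)) * drive
                       + 2 * gamma * np.cos(np.pi * m / M) * past(z[m], k - 1)
                       - gamma ** 2 * past(z[m], k - 2))
    return z
```

The recursive form needs a fixed number of operations per filter and sample regardless of M, and a stability parameter γ slightly below 1 keeps its poles inside the unit circle.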

9. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 1
wherein the noise reduction filter comprises:

a plurality of fixed filters, each fixed filter inputting a raw input signal derived from at least one of the system's
microphone signals and outputting a respective filtered signal;
a time-varying filter gain element corresponding to each fixed filter that inputs the respective filtered signal and
outputs a weighted and filtered signal, each time-varying filter gain element having a value that varies over time
in proportion to a signal strength level for the respective filtered signal; and
a summer that inputs the weighted and filtered input signals and outputs a noise-reduced signal.

10. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 9
wherein the value of each time-varying filter gain element is determined in accordance with the following
expression:

where β_m(k) is the value of the time-varying filter gain element for the m-th fixed filter at sampling period k,
m=0,1...M-1, SSL_m(k) is the speech strength level for the respective filtered telephone input signal at sampling
period k, and μ and α are preselected performance parameters having values greater than 0.

11. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 10
wherein the time-varying filter gain element β_m(k) for the m-th fixed filter is set equal to zero if noise power for the
respective frequency band is greater than a preselected threshold value.

12. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 10
wherein the performance parameter μ is approximately equal to 4 and the performance parameter α is
approximately equal to 2.

13. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 10
wherein the speech strength level for the respective filtered input signal at sample period k is determined in
accordance with the following expression:

$$SSL_m(k) = \frac{s\_pwr_m(k)}{n\_pwr_m(k)}$$

where s_pwr_m(k) is an estimate of combined speech and noise power in the m-th filtered input signal at sample
period k and n_pwr_m(k) is an estimate of noise power in the m-th filtered input signal used for sample period k.

14. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 13
wherein the noise power level estimate n_pwr_m(k), m=0,1...M-1 for sample period k for each of the filtered input
signals is accomplished in accordance with the following expression:

$$n\_pwr_m(k) = n\_pwr_m(k-1) + \lambda_0\left(z_m(k) \cdot z_m(k) - n\_pwr_m(k-1)\right)$$

where z_m(k) is the value of the respective filtered input signal at sample period k when speech is not present in the
raw input signal, and λ_0 is a fixed time constant.

15. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 14
wherein time constant λ_0 is set to a small value, thereby providing a long averaging window for estimating the noise
power level.

16. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 13
wherein the combined speech and noise power level s_pwr_m(k), m=0,1...M-1 for sample period k for each of the
filtered input signals is estimated in accordance with the following expression:

$$s\_pwr_m(k) = s\_pwr_m(k-1) + \lambda_m\left(z_m(k) \cdot z_m(k) - s\_pwr_m(k-1)\right)$$

where z_m(k) is the value of the respective filtered input signal at sample period k and λ_m is a fixed time constant for
the estimate of the combined speech and noise power level for each respective filtered input signal.
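
The two power estimates share one first-order recursion and differ only in their time constants and in when they are updated. A minimal sketch follows; the function names are hypothetical, the small `floor` guard is not part of the claims, and forming the speech strength level as the ratio of the two estimates is an assumption read from a damaged expression.

```python
def update_power(prev, z, lam):
    """One step of pwr(k) = pwr(k-1) + lam * (z(k) * z(k) - pwr(k-1))."""
    return prev + lam * (z * z - prev)

def track(samples, lam, init=0.0):
    """Run the recursion over a sequence of filtered samples and return the final estimate."""
    pwr = init
    for z in samples:
        pwr = update_power(pwr, z, lam)
    return pwr

def speech_strength(s_pwr, n_pwr, floor=1e-12):
    """Speech strength level taken as combined power over noise power (assumed form)."""
    return s_pwr / max(n_pwr, floor)
```

A small time constant makes each new sample move the estimate only slightly, which corresponds to a long averaging window; a larger one lets the combined speech and noise estimate follow the onset of speech quickly.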

17. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 1
wherein the microphone steering switch includes:

means for designating one of the echo-cancelled voice signals as the primary microphone; and
means for combining the echo-cancelled voice signals to generate the telephone input signal giving emphasis
to the echo-cancelled voice signal from the primary microphone.

18. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 17
wherein the designated primary microphone is set to an "on" state and the other one or more microphones remain
set in an "off" state, and the one or more microphones in the "off" state contribute approximately 20% of their
respective microphone signals to the telephone input signal.

19. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 1
wherein:

the cellular telephone outputs a telephone receive signal that is combined with both the near-end input signal
and the far-end input signal.

20. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 19
further comprising:

a microphone privacy switch that discontinues transmission of the far-end voice signal to the microphone
steering switch when the microphone privacy switch is open; and

a loudspeaker privacy switch that discontinues transmission of the telephone receive signal for combination
with the far-end input signal that inputs the far-end loudspeaker when the loudspeaker privacy switch is open.

21. A voice enhancement system comprising:

a near-end acoustic zone;
a far-end acoustic zone;
a plurality of near-end microphones that each sense sound in the near-end zone and each generate a near-end
voice signal;
a plurality of far-end microphones that each sense sound in the far-end zone and each generate a far-end voice
signal;
at least one near-end loudspeaker that inputs a near-end input signal and outputs sound into the near-end
zone;
at least one far-end loudspeaker that inputs a far-end input signal and outputs sound into the far-end zone;
one or more near-end adaptive echo cancellation channels, each receiving a respective near-end input signal
and outputting a near-end echo cancellation signal for an associated near-end microphone;
a near-end echo cancellation summer for each near-end microphone that inputs the near-end voice signal from
the respective near-end microphone and any near-end echo cancellation signal from the associated one or
more near-end adaptive echo cancellation channels, and outputs a respective echo-cancelled, near-end voice
signal;
one or more far-end adaptive echo cancellation channels, each receiving a respective far-end input signal and
outputting a far-end echo cancellation signal for an associated far-end microphone;
a far-end echo cancellation summer for each far-end microphone that inputs the far-end voice signal from the
respective far-end microphone and any far-end echo cancellation signal from the associated one or more
far-end adaptive echo cancellation channels, and outputs a respective echo-cancelled, far-end voice signal;
means for combining the plurality of echo-cancelled, near-end voice signals to form a near-end voice
enhancement input signal which is a speech component of the far-end input signal to the far-end loudspeaker; and
means for combining the plurality of echo-cancelled, far-end voice signals to form a far-end voice enhancement
input signal which is a speech component of the near-end input signal to the near-end loudspeaker.

22. A voice enhancement system as recited in claim 21 wherein:

said means for combining the plurality of echo-cancelled, near-end voice signals includes means for designating
one of the echo-cancelled, near-end voice signals as a primary near-end voice signal, and wherein the
designated primary near-end microphone is set to an "on" state and the one or more other near-end microphones
remain set in an "off" state, and the one or more near-end microphones in the "off" state contribute less than
10% of their respective microphone signals to the near-end voice enhancement transmit signal; and
said means for combining the plurality of echo-cancelled, far-end voice signals includes means for designating
one of the echo-cancelled, far-end voice signals as a primary far-end voice signal, and wherein the designated
primary far-end microphone is set to an "on" state and the one or more other far-end microphones remain set
in an "off" state, and the one or more far-end microphones in the "off" state contribute less than 10% of their
respective microphone signals to the far-end voice enhancement transmit signal.

23. An integrated vehicle voice enhancement system and hands-free cellular telephone system comprising:

a near-end acoustic zone;
a far-end acoustic zone;
a plurality of near-end microphones that each sense sound in the near-end zone and each generate a near-end
voice signal;
a plurality of far-end microphones that each sense sound in the far-end zone and each generate a far-end voice
signal;
at least one near-end loudspeaker that inputs a near-end input signal and outputs sound into the near-end
zone;
at least one far-end loudspeaker that inputs a far-end input signal and outputs sound into the far-end zone;
one or more near-end adaptive echo cancellation channels, each receiving a respective near-end input signal
and outputting a near-end echo cancellation signal for an associated near-end microphone;
a near-end echo cancellation summer for each near-end microphone that inputs the respective near-end voice
signal from the respective near-end microphone and any near-end echo cancellation signal from the associated
one or more near-end adaptive echo cancellation channels, and outputs a respective echo-cancelled,
near-end voice signal;
one or more far-end adaptive echo cancellation channels, each receiving a respective far-end input signal and
outputting a far-end echo cancellation signal for an associated far-end microphone;
a far-end echo cancellation summer for each far-end microphone that inputs the far-end voice signal from the
respective far-end microphone and any far-end echo cancellation signal from the associated one or more
far-end adaptive echo cancellation channels, and outputs a respective echo-cancelled, far-end voice signal;
a microphone steering switch that inputs the echo-cancelled, near-end voice signals and the echo-cancelled,
far-end voice signals and outputs a telephone input signal;
a cellular telephone that inputs the telephone input signal;
wherein at least one noise reduction filter is used to improve the clarity of the telephone input signal inputting
the cellular telephone.

24. A method of generating a noise-reduced telephone input signal in a hands-free telephone system for a vehicle, the 
method comprising the steps of: 

sensing background noise within the vehicle and driver and passenger speech within the vehicle using at least 
one microphone located within the vehicle, and generating an input signal in response thereto; 
filtering the input signal through a plurality of M fixed filters to generate a plurality of M filtered input signals, the 
fixed filters being a recursive implementation of a discrete cosine transform modified to stabilize its perform- 
ance on a digital signal processor; 

estimating a noise power level for each of the M filtered input signals; 

estimating a combined speech and noise power level of each of the M filtered input signals; 

weighting each of the plurality of M filtered input signals by a respective time-varying filter gain $\beta_m$ which is
determined in accordance with the respective estimate of the combined speech and noise power level and the
estimate of the noise power level; and

combining the M weighted and filtered input signals to form a noise-reduced input signal. 

25. A method as recited in claim 24 wherein estimating the noise power level is first accomplished after system start-up
before a noise-reduced telephone input signal is transmitted to a telephone, and the method further comprises
the steps of:

monitoring whether speech is present in the raw input signal; and

periodically estimating the noise power level after a noise-reduced input signal has been transmitted to the
telephone when speech is not present in the raw input signal.

26. A method as recited in claim 24 wherein:

the noise power level estimate for sample period k for each of the M filtered input signals, $\mathrm{n\_pwr}_m(k)$, m=0,1...M-1,
is accomplished in accordance with the following expression:

$$\mathrm{n\_pwr}_m(k) = \mathrm{n\_pwr}_m(k-1) + \lambda_0 \left( z_m(k)\, z_m(k) - \mathrm{n\_pwr}_m(k-1) \right)$$

where $z_m(k)$ is the value of the respective filtered input signal at sample period k when speech is not present
in the raw input signal, and $\lambda_0$ is a fixed time constant.

27. An integrated vehicle voice enhancement system and hands-free cellular telephone system as recited in claim 26
wherein the time-varying filter gain element $\beta_m(k)$ for the m-th fixed filter is set equal to zero if noise power for the
respective frequency band is greater than a preselected threshold value.

28. A method as recited in claim 26 wherein the time constant $\lambda_0$ is set to a small value, thereby providing a long
averaging window for estimating the noise power level $\mathrm{n\_pwr}_m(k)$.

29. A method as recited in claim 26 wherein the combined speech and noise power level for sample period k for each
of the M filtered input signals, $\mathrm{s\_pwr}_m(k)$, m=0,1...M-1, is accomplished in accordance with the following
expression:

$$\mathrm{s\_pwr}_m(k) = \mathrm{s\_pwr}_m(k-1) + \lambda_m \left( z_m(k)\, z_m(k) - \mathrm{s\_pwr}_m(k-1) \right)$$

where $z_m(k)$ is the value of the respective filtered input signal at sample period k, and $\lambda_m$ is a fixed time constant
for the combined speech and noise power level estimate for each of the M fixed filters.

30. A method as recited in claim 29 wherein the M time-varying filter gains $\beta_m(k)$ are determined in accordance with
the following expressions:

$$\mathrm{SSL}_m(k) = \frac{\mathrm{s\_pwr}_m(k)}{\mathrm{n\_pwr}_m(k)}$$

where $\alpha, \mu \ge 0$ are performance parameters, and $\mathrm{SSL}_m(k)$ is the speech strength level for the m-th filtered input
signal at sample period k.
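
The power estimates recited in claims 26 and 29 are first-order exponential averages of the squared band signal, and they can be sketched for a single band as follows. This is only an illustrative sketch: the class name `BandPowerTracker`, the externally supplied `speech_present` flag, the zero initial values and the small floor `eps` are assumptions, and the speech strength level is taken here to be the ratio of the combined speech and noise power to the noise power. No gain rule is included.

```python
class BandPowerTracker:
    """Tracks n_pwr_m(k) and s_pwr_m(k) for one filtered input signal z_m(k).

    Hypothetical helper for illustration; names are not taken from the claims.
    """

    def __init__(self, lam0, lam_m):
        self.lam0 = lam0      # time constant for the noise power estimate
        self.lam_m = lam_m    # time constant for the speech-plus-noise estimate
        self.n_pwr = 0.0      # assumed initial value
        self.s_pwr = 0.0      # assumed initial value

    def update(self, z, speech_present):
        power = z * z
        # Combined speech and noise power is updated on every sample.
        self.s_pwr += self.lam_m * (power - self.s_pwr)
        # Noise power is updated only while speech is absent.
        if not speech_present:
            self.n_pwr += self.lam0 * (power - self.n_pwr)
        return self.s_pwr, self.n_pwr

    def ssl(self, eps=1e-12):
        # Assumed reading: speech strength level = s_pwr / n_pwr.
        # eps is an added guard against division by zero.
        return self.s_pwr / max(self.n_pwr, eps)
```

A smaller `lam0` makes the noise estimate respond more slowly, which corresponds to the long averaging window recited in claim 28.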

31. A method as recited in claim 24 wherein at least two microphones are used to sense background noise and driver
and passenger speech within the vehicle, each microphone generating a voice signal that is combined with at least 
one echo cancellation signal to generate an echo-cancelled voice signal, and the method further comprises the 
steps of: 

selecting one of the echo-cancelled, voice signals as a primary voice signal and generating the raw input signal 
by combining the echo-cancelled voice signals giving emphasis to the echo-cancelled voice signal that was 
selected as the primary voice signal. 

32. A method as recited in claim 24 wherein the plurality of fixed filters are infinite impulse response filters represented
by the following expressions:

$$z_0(k) = \frac{1}{M}\left[ x(k) - \gamma^{M} x(k-M) \right] + \gamma\, z_0(k-1)$$

for m=0

$$z_m(k) = \frac{2}{M}\cos\!\left(\frac{\pi m}{2M}\right)\left[ x(k) - \gamma\, x(k-1) + (-1)^m \gamma^{M+1} x(k-[M+1]) - (-1)^m \gamma^{M} x(k-M) \right] + 2\gamma\cos\!\left(\frac{\pi m}{M}\right) z_m(k-1) - \gamma^{2} z_m(k-2)$$

for m=1,2...M-1

where $\gamma$ is a preselected stability parameter, x(k) is the raw input signal for sample period k, and $z_m$ is the filtered
input signal for the m-th fixed filter m=0,1...M-1.
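
The recursive filter bank of claim 32 can be checked numerically against a direct windowed cosine transform. The sketch below is illustrative only. It assumes scale factors of 1/M for m=0 and (2/M)cos(πm/2M) for the other bands, a feedback coefficient of 2γcos(πm/M), and zero initial conditions; with those assumptions each band's impulse response is the DCT-II kernel of length M weighted by γ^n. The function names are hypothetical.

```python
import math


def dct_bank_recursive(x, M, gamma):
    """Band signals z[m][k], m = 0..M-1, computed with the IIR recursions."""
    def xs(k):
        # Input sample with zero initial conditions (an assumption).
        return x[k] if k >= 0 else 0.0

    z = [[0.0] * len(x) for _ in range(M)]
    for m in range(M):
        for k in range(len(x)):
            z1 = z[m][k - 1] if k >= 1 else 0.0
            z2 = z[m][k - 2] if k >= 2 else 0.0
            if m == 0:
                z[m][k] = (xs(k) - gamma ** M * xs(k - M)) / M + gamma * z1
            else:
                sign = (-1.0) ** m
                fir = (xs(k) - gamma * xs(k - 1)
                       + sign * gamma ** (M + 1) * xs(k - M - 1)
                       - sign * gamma ** M * xs(k - M))
                z[m][k] = ((2.0 / M) * math.cos(math.pi * m / (2 * M)) * fir
                           + 2.0 * gamma * math.cos(math.pi * m / M) * z1
                           - gamma ** 2 * z2)
    return z


def dct_bank_direct(x, M, gamma):
    """Same band signals from a direct sum over the last M input samples."""
    z = [[0.0] * len(x) for _ in range(M)]
    for m in range(M):
        for k in range(len(x)):
            acc = 0.0
            for n in range(min(M, k + 1)):
                if m == 0:
                    c = 1.0 / M
                else:
                    c = (2.0 / M) * math.cos(math.pi * m * (2 * n + 1) / (2 * M))
                acc += c * gamma ** n * x[k - n]
            z[m][k] = acc
    return z
```

Choosing γ slightly below 1 moves the poles of each recursion inside the unit circle, so rounding errors decay instead of accumulating. This is one way to read the stabilizing modification recited in claim 24.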

33. A method of voice activated switching in a hands-free cellular telephone system for a vehicle, the method compris- 
ing the steps of: 

sensing background noise within the vehicle and driver and passenger speech within the vehicle using a plu- 
rality of microphones and generating a plurality of respective microphone signals in response thereto; 
determining the magnitude of each respective microphone signal;

maintaining microphone output for each microphone in an "off" state if the magnitude of the respective
microphone signal is below a threshold switching value;

if the microphone output for none of the microphones is in an "on" state, designating one of the microphone 
signals as a primary microphone signal when the magnitude of one of the microphone signals exceeds the 
threshold switching value and switching the microphone output for the respective primary microphone to an 
"on" state; 

holding the microphone output for the designated primary microphone in the "on" state for a holding period 
after the magnitude of the primary microphone signal falls below the threshold switching value; 
whenever a primary microphone signal has been designated, determining whether any of the microphone sig- 
nals from any of the other microphones exceeds the threshold switching value, and if so, comparing the mag- 
nitude of any such microphone signals to designate one of the microphone signals as a priority switching 
microphone signal; 

if no microphone signal is designated as a priority switching microphone signal, fading out the microphone out- 
put for the designated primary microphone from the "on" state to the "off" state for a fading out period after expi- 
ration of the holding period, thereby terminating the "on" state of the microphone output for the designated 
primary microphone at the beginning of the fading out period; and 

if a microphone signal is designated as a priority switching microphone signal, cross-fading the microphone
output for the designated primary microphone from the "on" state to the "off" state and the microphone output
of the designated priority switching microphone from the "off" state to the "on" state if the magnitude of the
microphone signal from the primary microphone does not exceed the threshold switching value before the
termination of the holding period, thereby rendering the priority switching microphone as the designated primary
microphone.

34. A method as recited in claim 33 further comprising the step of: 

if a microphone signal is designated as a priority switching microphone signal, cross-fading the microphone
output for the designated primary microphone from the "on" state to the "off" state and the microphone output
for the designated priority switching microphone from the "off" state to the "on" state if the magnitude of the
priority switching microphone signal exceeds the magnitude of the primary microphone signal by a cross-fading
threshold value, thereby rendering the priority switching microphone as the designated primary microphone.
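
The switching steps of claims 33 and 34 can be sketched as a small state machine that is updated once per frame of microphone magnitudes. This is an illustrative sketch under several assumptions: the class name `SteeringSwitch`, frame-based hold and fade counters, linear fade ramps, and an additive comparison for the cross-fading threshold are all hypothetical choices and are not recited in the claims.

```python
class SteeringSwitch:
    """Per-frame output gains for N microphones (hypothetical helper)."""

    def __init__(self, n_mics, threshold, hold_frames, fade_frames,
                 xfade_margin=None):
        self.threshold = threshold
        self.hold_frames = hold_frames
        self.step = 1.0 / fade_frames       # linear ramp is an assumption
        self.xfade_margin = xfade_margin    # None disables the claim 34 rule
        self.primary = None                 # index of the designated primary
        self.hold_left = 0
        self.target = [0.0] * n_mics        # 1.0 = "on", 0.0 = "off"
        self.gain = [0.0] * n_mics

    def _designate(self, i):
        self.primary = i
        self.target[i] = 1.0
        self.hold_left = self.hold_frames

    def update(self, mags):
        over = [i for i, a in enumerate(mags) if a > self.threshold]
        if self.primary is None:
            if over:
                # No microphone is "on": the loudest one becomes primary.
                self._designate(max(over, key=lambda i: mags[i]))
                self.gain[self.primary] = 1.0   # switched on without a fade-in
        else:
            p = self.primary
            if mags[p] > self.threshold:
                self.hold_left = self.hold_frames   # restart the holding period
            else:
                self.hold_left -= 1
            others = [i for i in over if i != p]
            priority = max(others, key=lambda i: mags[i]) if others else None
            louder = (priority is not None and self.xfade_margin is not None
                      and mags[priority] - mags[p] > self.xfade_margin)
            if self.hold_left <= 0 or louder:
                self.target[p] = 0.0            # primary starts fading out
                if priority is not None:
                    self._designate(priority)   # cross-fade to the new primary
                else:
                    self.primary = None         # plain fade-out
        # Move every gain one step toward its target.
        for i, t in enumerate(self.target):
            d = t - self.gain[i]
            self.gain[i] += max(-self.step, min(self.step, d))
        return list(self.gain)
```

In this sketch the primary designation is cleared as soon as a plain fade-out begins, so a new primary can be designated while the old output is still ramping down.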

35. A method as recited in claim 33 wherein the primary microphone signal is combined with the other microphone
signals giving emphasis to the primary microphone signals to generate a raw telephone input signal.

36. A method as recited in claim 35 wherein microphones having microphone output in the "off" state contribute about 
20% of their respective microphone signal to the raw telephone input signal. 
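
The combination recited in claims 35 and 36 amounts to a weighted sum of one sample from each microphone. The sketch below is a minimal illustration that treats "about 20%" as a fixed share of 0.2 and passes the primary microphone at full level; the function name `raw_telephone_input` and the parameter `off_share` are hypothetical.

```python
def raw_telephone_input(samples, primary, off_share=0.2):
    """Mix one sample per microphone into a raw telephone input sample.

    samples   -- one sample from each microphone for the current period
    primary   -- index of the designated primary microphone, or None
    off_share -- share contributed by microphones in the "off" state
                 (0.2 is an assumed stand-in for "about 20%")
    """
    return sum(s if i == primary else off_share * s
               for i, s in enumerate(samples))
```

When no primary microphone is designated, every microphone contributes only its "off" share, so the output carries attenuated background sound rather than silence.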

37. A method as recited in claim 35 wherein microphones having microphone output in the "off" state contribute about 
10% of their respective microphone signal to generate a voice transmit signal in a voice enhancement system for 
a vehicle. 